Thursday, July 27, 2017

talk.origins evolves

The newsgroup talk.origins was created more than 30 years ago. It's been a moderated newsgroup for the past twenty years. The moderator is David Greig and the server, named "Darwin," has been sitting in my office for most of that time. I retired on June 30th and my office is scheduled for renovation so Darwin had to move. Another complication is that the moderator is moving from Toronto to Copenhagen, Denmark.

So talk.origins evolves and the server is moving elsewhere. Goodbye, Darwin.



Friday, July 14, 2017

Bastille Day 2017

Today is the Fête Nationale in France, also known as "le quatorze juillet" or Bastille Day.

This is the day in 1789 when French citizens stormed and captured the Bastille, a Royalist fortress in Paris. It marks the symbolic beginning of the French Revolution, although the real beginning was when the Third Estate transformed itself into the National Assembly on June 17, 1789 [Tennis Court Oath].

Ms. Sandwalk and I visited the site of the Bastille (Place de la Bastille) when we were in Paris in 2008. There's nothing left of the former castle but the site still resonates with meaning and history.

One of my wife's ancestors is William Playfair, the inventor of pie charts and bar graphs [Bar Graphs, Pie Charts, and Darwin]. His work attracted the attention of the French King so he moved to Paris in 1787 to set up an engineering business. He is said to have participated in the storming of the Bastille but he has a history of exaggeration and untruths so it's more likely that he just witnessed the event. He definitely lived nearby and was in Paris on the day in question. (His son, my wife's ancestor, was born in Paris in 1790.)

In honor of the French national day I invite you to sing the French national anthem, La Marseillaise. An English translation is provided so you can see that La Marseillaise is truly a revolutionary call to arms. (A much better translation can be found here.)1



1. I wonder if President Trump sang La Marseillaise while he was at the ceremonies today?

Check out Uncertain Principles for another version of La Marseillaise—this is the famous scene in Casablanca.

Reposted and modified from 2016.

Revisiting the genetic load argument with Dan Graur

The genetic load argument is one of the oldest arguments for junk DNA and it's one of the most powerful arguments that most of our genome must be junk. The concept dates back to J.B.S. Haldane in the late 1930s but the modern argument traditionally begins with Hermann Muller's classic paper from 1950. It has been extended and refined by him and many others since then (Muller, 1950; Muller, 1966).

Thursday, July 06, 2017

Scientists say "sloppy science" more serious than fraud

An article on Nature: INDEX reports on a recent survey of scientists: Cutting corners a bigger problem than research fraud. The subtitle says it all: Scientists are more concerned about the impact of sloppy science than outright scientific fraud.

The survey was published on BioMed Central.

Tuesday, July 04, 2017

Another contribution of philosophy: Bernard Lonergan

The discussion about philosophy continues on Facebook. One of my long-time Facebook friends, Jonathan Bernier, took up the challenge. Bernier is a professor of religious studies at St. Francis Xavier University in Nova Scotia, Canada. He is a card-carrying philosopher.1

The challenge is to provide recent (past two decades) examples from philosophy that have led to increased knowledge and understanding of the natural world. Here's what Jonathan Bernier offered.
But to use just one example of advances in philosophical understanding, UofT (specifically Regis College) houses the Lonergan Research Institute, which houses Bernard Lonergan's archives and publishes his collected works. Probably his most significant work is a seven-hundred-page tome called Insight, the first edition of which was published in 1957. It is IMHO the single best account of how humans come to know anything that has ever been written. The tremendous fruits that it has wrought cannot be summarized in a FB commend. Instead, I'd suggest that you walk over and see the friendly people at the LRI. No doubt they could help answer some of your questions.
Here's a Wikipedia link to Bernard Lonergan. He was a Canadian Jesuit priest who died in 1984. Regis College is the Jesuit College associated with the University of Toronto.

Is Jonathan Bernier correct? Is it true that Lonergan's works will eventually change the way we understand learning?


Note: In my response to Bernier on Facebook I said, "I guess I'll just have to take your word for it. I'm not about to walk over to Regis College and consult a bunch of Jesuit priests about the nature of reality." Was I being too harsh? Is this really an example of a significant contribution of philosophy? Is it possible that a philosopher could be very wrong about the existence of supernatural beings but still make a contribution to the nature of knowledge and understanding?

1. Jonathan Bernier tells me on Facebook that he is not a philosopher and never claimed to be a philosopher.

Monday, July 03, 2017

Contributions of philosophy

I've been discussing the contributions of philosophy on Facebook. Somebody linked to a post on the topic: What has philosophy contributed to society in the past 50 years?. Here's one of the contributions ... do you agree?
Philosophers, historians, and sociologists of science such as Thomas Kuhn, Paul Feyerabend, Bruno Latour, Bas van Fraassen, and Ian Hacking have changed the way that we see the purpose of science in everyday life, as well as proper scientific conduct. Kuhn's concept of a paradigm shift is now so commonplace as to be cliche. Meanwhile, areas like philosophy of physics and especially philosophy of biology are sites of active engagement between philosophers and scientists about the interpretation of scientific results.


Sunday, July 02, 2017

Confusion about the number of genes

My last post was about confusion over the sizes of the human and mouse genomes based on a recent paper by Breschi et al. (2017). Their statements about the number of genes in those species are also confusing. Here's what they say about the human genome.
[According to Ensembl86] the human genome encodes 58,037 genes, of which approximately one-third are protein-coding (19,950), and yields 198,093 transcripts. By comparison, the mouse genome encodes 48,709 genes, of which half are protein-coding (22,018 genes), and yields 118,925 transcripts overall.
The very latest Ensembl estimates (April 2017) for Homo sapiens and Mus musculus are similar. The difference in gene numbers between mouse and human is not significant according to the authors ...
The discrepancy in total number of annotated genes between the two species is unlikely to reflect differences in underlying biology, and can be attributed to the less advanced state of the mouse annotation.
This is correct but it doesn't explain the other numbers. There's general agreement on the number of protein-coding genes in mammals. They all have about 20,000 genes. There is no agreement on the number of genes for functional noncoding RNAs. In its latest build, Ensembl says there are 14,727 lncRNA genes, 5,362 genes for small noncoding RNAs, and 2,222 other genes for noncoding RNAs. The total number of non-protein-coding genes is 22,311.
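As a quick check on the arithmetic, here is a minimal Python sketch that adds up the three noncoding categories quoted above and compares the total with the rough figure of 20,000 protein-coding genes. The variable names are my own and the counts are simply the ones quoted in this post, not values pulled from Ensembl.

```python
# Noncoding gene categories as quoted in the text from the Ensembl build.
# These are the annotated counts, not evidence of biological function.
noncoding_gene_counts = {
    "lncRNA": 14_727,
    "small noncoding RNA": 5_362,
    "other noncoding RNA": 2_222,
}

# Rough number of protein-coding genes in mammals, as stated in the text.
PROTEIN_CODING_APPROX = 20_000

total_noncoding = sum(noncoding_gene_counts.values())
ratio = total_noncoding / PROTEIN_CODING_APPROX

print(f"Total annotated non-protein-coding genes: {total_noncoding:,}")
print(f"Noncoding genes per protein-coding gene: {ratio:.2f}")
```

If you take the annotation at face value, there are slightly more noncoding genes than protein-coding genes.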

There is no solid evidence to support this claim. It's true there are many transcripts resembling functional noncoding RNAs but claiming these identify true genes requires evidence that they have a biological function. It would be okay to call them "potential" genes or "possible" genes but the annotators are going beyond the data when they decide that these are actually genes.

Breschi et al. mention the number of transcripts. I don't know what method Ensembl uses to identify a functional transcript. Are these splice variants of protein-coding genes?

The rest of the review discusses the similarities between human and mouse genes. They point out, correctly, that about 16,000 protein-coding genes are orthologous. With respect to lncRNAs they discuss all the problems in comparing human and mouse lncRNA and conclude that "... the current catalogues of orthologous lncRNAs are still highly incomplete and inaccurate." There are several studies suggesting that only 1,000-2,000 lncRNAs are orthologous. Unfortunately, there's very little overlap between the two most comprehensive studies (189 lncRNAs in common).

There are two obvious possibilities. First, it's possible that these RNAs are just due to transcriptional noise and that's why the ones in the mouse and human genomes are different. Second, all these RNAs are functional but the genes have arisen separately in the two lineages. This means that about 10,000 genes for biologically functional lncRNAs have arisen in each of the genomes over the past 100 million years.

Breschi et al. don't discuss the first possibility.


Breschi, A., Gingeras, T.R., and Guigó, R. (2017) Comparative transcriptomics in human and mouse. Nature Reviews Genetics [doi: 10.1038/nrg.2017.19]

Genome size confusion

The July 2017 issue of Nature Reviews Genetics contains an interesting review of a topic that greatly interests me.
Breschi, A., Gingeras, T. R., and Guigó, R. (2017). Comparative transcriptomics in human and mouse. Nature Reviews Genetics [doi: 10.1038/nrg.2017.19]

Cross-species comparisons of genomes, transcriptomes and gene regulation are now feasible at unprecedented resolution and throughput, enabling the comparison of human and mouse biology at the molecular level. Insights have been gained into the degree of conservation between human and mouse at the level of not only gene expression but also epigenetics and inter-individual variation. However, a number of limitations exist, including incomplete transcriptome characterization and difficulties in identifying orthologous phenotypes and cell types, which are beginning to be addressed by emerging technologies. Ultimately, these comparisons will help to identify the conditions under which the mouse is a suitable model of human physiology and disease, and optimize the use of animal models.
I was confused by the comments made by the authors when they started comparing the human and mouse genomes. They said,
The most recent genome assemblies (GRC38) include 3.1 Gb and 2.7 Gb for human and mouse respectively, with the mouse genome being 12% smaller than the human one.
I think this statement is misleading. The size of the human genome isn't known with precision but the best estimate is 3.2 Gb [How Big Is the Human Genome?]. The current "golden path length" according to Ensembl is 3,096,649,726 bp. [Human assembly and gene annotation]. It's not at all clear what this means and I've found it almost impossible to find out; however, I think it approximates the total amount of sequenced DNA in the latest assembly plus an estimate of the size of some of the gaps.

The golden path length for the mouse genome is 2,730,871,774 bp. [Mouse assembly and gene annotation]. As is the case with the human genome, this is NOT the genome size. Not as much mouse DNA sequence has been assembled into a contiguous and accurate assembly as is the case with humans. The total mouse sequence is at about the same stage the human genome assembly was a few years ago.

If you look at the mouse genome assembly data you see that 2,807,715,301 bp have been sequenced and there are 79,356,856 bp in gaps. That's about 2.89 Gb, which doesn't match the golden path length and doesn't match the past estimates of the mouse genome size.
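The mismatch is easy to see if you add the numbers yourself. This is only a sketch using the figures quoted in this post; the variable names are hypothetical and nothing here is fetched from Ensembl.

```python
# Mouse assembly figures as quoted in the text (base pairs).
sequenced_bp = 2_807_715_301     # sequenced bases
gap_bp = 79_356_856              # bases in gaps
golden_path_bp = 2_730_871_774   # Ensembl "golden path length" for mouse

# Sequenced bases plus gaps, and how far that is from the golden path length.
assembly_total_bp = sequenced_bp + gap_bp
excess_bp = assembly_total_bp - golden_path_bp

print(f"Sequenced + gaps:    {assembly_total_bp:,} bp ({assembly_total_bp / 1e9:.3f} Gb)")
print(f"Golden path length:  {golden_path_bp:,} bp ({golden_path_bp / 1e9:.3f} Gb)")
print(f"Difference:          {excess_bp:,} bp")
```

The difference is more than 150 million bp, so whatever the golden path length measures, it isn't simply the sequenced bases plus the gaps.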

We don't know the exact size of the mouse genome. It's likely to be similar to that of the human genome but it could be a bit larger or a bit smaller. The point is that it's confusing to say that the mouse genome is 12% smaller than the human one. What the authors could have said is that less of the mouse genome has been sequenced and assembled into accurate contigs.

If you go to the NCBI site for Homo sapiens you'll see that the size of the genome is 3.24 Gb. The comparable size for Mus musculus is 2.81 Gb. That's about 13% smaller than the human genome size. How accurate is that?
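The percentage you get depends on which genome you use as the baseline, so it's worth working out explicitly. Here's a small sketch, assuming the NCBI and Ensembl figures quoted in this post; the function names are mine.

```python
def percent_smaller(smaller, larger):
    """Percent by which `smaller` falls short of `larger` (baseline: larger)."""
    return 100 * (larger - smaller) / larger


def percent_larger(larger, smaller):
    """Percent by which `larger` exceeds `smaller` (baseline: smaller)."""
    return 100 * (larger - smaller) / smaller


# NCBI genome sizes quoted in the text (Gb).
ncbi_human_gb, ncbi_mouse_gb = 3.24, 2.81

# Ensembl golden path lengths quoted in the text (bp).
golden_human_bp, golden_mouse_bp = 3_096_649_726, 2_730_871_774

ncbi_mouse_smaller = percent_smaller(ncbi_mouse_gb, ncbi_human_gb)
ncbi_human_larger = percent_larger(ncbi_human_gb, ncbi_mouse_gb)
golden_mouse_smaller = percent_smaller(golden_mouse_bp, golden_human_bp)

print(f"NCBI: mouse is {ncbi_mouse_smaller:.1f}% smaller than human")
print(f"NCBI: human is {ncbi_human_larger:.1f}% larger than mouse")
print(f"Golden path: mouse is {golden_mouse_smaller:.1f}% smaller than human")
```

With the human genome as the baseline the NCBI numbers give about 13%; a figure of about 15% only appears if you ask how much larger the human genome is than the mouse genome. The golden path lengths give roughly the 12% quoted by Breschi et al.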

There's a problem here. Even with all this sequence information, and all kinds of other data, it's still impossible to get an accurate scientific estimate of the total genome sizes.


[Image Credit: Wikipedia: Creative Commons Attribution 2.0 Generic license]