Cycling Scotland end-to-end over 3 days

On Saturday I’ll be starting the first of three days cycling the length of Scotland. It’s something I’ve always wanted to do and will serve as a “goodbye for now” as I move to the Land of Eng later in the month. The primary reason I’m doing this is to raise funds for a great charity, the Scottish Association for Mental Health (SAMH). They provide great support for those in need and also tackle the stigma of mental health. One in four people in the UK have to deal with mental health problems.

Various Scottish poems have mentioned the journey from John o’Groats to Kirkmaiden, which is generally considered to be the end-to-end journey for mainland Scotland. Robert Burns wrote of the journey “frae Maidenkirk tae Johnnie Groats” in his poem On the Late Captain Grose’s Peregrinations Through Scotland. Robert Louis Stevenson wrote:

But maistly thee, the bluid o’ Scots,
Frae Maidenkirk to John o’ Grots,
The king o’ drinks, as I conceive it,
Talisker, Isla, or Glenlivet!

The Maidenkirk they speak of is now Kirkmaiden. Although it is in Scotland, it’s actually further south than Carlisle.

Day 1: Leaving John o’Groats and cycling 160 miles through Inverness, along Loch Ness, and staying the night near Loch Oich.

Day 2: Cycling through Fort William, through the Trossachs and Loch Lomond National Park, then on to Glasgow to stay the night.

Day 3: Cycling as far south as I can get until I’m pretty close to Northern Ireland!

I’ll post an update after the cycle with photos if possible.

If you would like to support SAMH you can donate on my Justgiving page for the ride. Thank you!

Best of “Not New Scientist”

I made a light-hearted Twitter bot to poke a bit of fun at New Scientist. @NS_headlines attempted to automatically create sensationalist and surreal headlines. The account has posted over 1000 tweets now, so I feel it’s a good time to switch it off. To the 845 followers who I know must be desperate for it to continue, I might improve the script in the future and restart it, but I’m more interested in other side-projects just now. I felt it might be worth making a short Best Of list to see some of what the account was able to come up with. The account was featured on Buzzfeed so there is a collection of favourites there, but here are a few more that were highlighted by Favstar as being popular.

GenomeTweet – Yeast is finished!

It’s done.

Finally, after 98345 tweets, the entire Saccharomyces cerevisiae genome is on Twitter.

S. cerevisiae is a yeast commonly used in brewing, wine-making, baking, and biological research in the lab. Now we have virus (HIV), prokaryote (Escherichia coli), and eukaryote (S. cerevisiae) genomes on Twitter.

HIV – 70 tweets

E. coli – 34767 tweets

Yeast – 98345 tweets

People who use Twitter can appreciate how much data can be stored in a tweet. A single tweet has a limit of 140 characters. When I see that tweeting the E. coli genome took 34767 tweets, I understand that it took me about 4 years to tweet that many times. I can visualise and appreciate this better than learning “a genome would fill a few squintillion books”. What books? How big are the books? How thick? How many pages? I can’t visualise it because I’ve never seen that many books at once. Actually, is squintillion a thing? But I can compare these genomes with my own Twitter output and appreciate the size a bit better. It’s not perfect, but hey I was just bored one evening and thought tweeting genomes could be interesting.
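For anyone curious about the arithmetic, here is a minimal sketch of cutting a sequence into tweet-sized pieces. It is not the script that actually ran on the Pi: the function names and the toy sequence are made up for illustration, and it assumes every tweet is filled right up to the 140-character limit with nothing but sequence. Real tweet counts will be higher if each tweet carries anything besides sequence.

```python
import math

TWEET_LIMIT = 140  # character limit per tweet, as stated above


def split_into_tweets(sequence, limit=TWEET_LIMIT):
    """Cut a nucleotide string into consecutive chunks of at most `limit` characters."""
    return [sequence[i:i + limit] for i in range(0, len(sequence), limit)]


def tweets_needed(genome_length, limit=TWEET_LIMIT):
    """Number of tweets required if every tweet is filled to the limit."""
    return math.ceil(genome_length / limit)


# A made-up 400-base sequence, purely for illustration.
toy_genome = "ACGT" * 100
tweets = split_into_tweets(toy_genome)

print(len(tweets))               # 3
print([len(t) for t in tweets])  # [140, 140, 120]
```

Joining the chunks back together in order gives the original sequence, which is a handy sanity check before letting a bot loose on a whole genome.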

I’m now taking a break from tweeting genomes, at least for a while, so my Raspberry Pi is free to be used as a brain by Deckard The Robot. He’ll be coming to life over the next few months.

Homology

I’ve been having a discussion where the term “homology” was thrown around very loosely by someone who should probably know better. Considering deep homology and gene co-option are among my favourite topics in biology, I obviously could feel a rant coming on. Rants are often transformed into blogs these days. Those who have at least heard of the word might think it has something to do with things being similar. Those with an interest in biology, especially evolutionary biology, probably realise it’s more complicated than superficial similarity. Sadly, I see the word used in various ways all the time and thought I’d share some thoughts. The word “homology” has been used for over 150 years and has meant different things to different biologists, with similarity of characters being the common theme. Most modern biologists use the word to refer to similarity that is due to common ancestry: two characters are homologous if they are derived from the same ancestral character in their most recent common ancestor. Any characters that exist in related lineages can be assessed for homology, including genes, chromosomes, genomes, cells, limbs, regions of the brain, behaviours, and the developmental programmes that result in these characters. Ancestors are rarely available for examination so homology is usually an evolutionary hypothesis rather than a direct observation.

Even among biologists, the correct definition of homology is still occasionally an issue. Some researchers write of “functional homology” when describing similar functions of traits. Some examples of supposed “functional homology” will be truly homologous in the sense of common ancestry, while others will be non-homologous but convergently similar. In some of the literature, homology still refers to characters that are merely similar, regardless of ancestry. If homology is defined by similarities, there may be a gradient of homology for any given character. Some characters are presumably more similar than others. If they are slightly similar, are they only slightly homologous? If they are very similar, are they very homologous? Where do we draw the line? When do two similar traits become similar enough to warrant being described as homologous? This subjective issue is avoided entirely when common ancestry is used to define homology. We may not know for certain if two characters are homologous, but they either are or aren’t. This approach makes the concept of homology simpler to define and easier for researchers to agree upon, but it requires rigorous investigation to determine whether homology exists for any given character.


Homology versus Homoplasy: Image taken from Hall (2003). This image demonstrates the homologous and homoplastic relationships of a character, C. B represents the plesiomorphic state of the character. The Cs in clade 1 are homologous because C already existed in the most recent common ancestor of the two lineages. The same situation is observed in clade 2. But when comparing Cs between clades 1 and 2, the relationship is homoplastic as C did not exist in the most recent common ancestor (3). Lineages 1 and 2 independently evolved the C character from the ancestral B character.

Understanding ancestral relationships in any kind of comparative biology usually involves recognising the differences between homology and homoplasy (as in the figure above). Homology is similarity because of common descent and ancestry. Homoplasy is similarity because of independent convergent evolution. Definitions must be clear. Some related characters are orthologues, arising from lineages splitting and diverging. Others are paralogues, arising from gene duplication. Some genes can also be xenologues if they have arisen from horizontal gene transfer. Understanding homology is essential in comparative biology because of the practical applications of such knowledge. Homology can be used in constructing character matrices for phylogenetic analyses. Also, finding functionally equivalent orthologues of human genes in model organisms has an important role in medical research. A geneticist studying fly orthologues of our genes needs to be sure that he/she has the correct homologue. The same can be said for a medical researcher studying human orthologues in mice that may influence the likelihood of getting cancer or Alzheimer’s disease. It is vital that evolutionary biologists understand what is truly homologous.
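The homology/homoplasy distinction can be written down as a tiny sketch. The tree below is a hypothetical toy that mirrors the figure from Hall (2003): the node names are invented, and the ancestral states are simply asserted rather than inferred, which is the hard part in real life. It also takes a shortcut by only looking at the state in the most recent common ancestor, so it would miss a character that was lost and regained along the way.

```python
# Hypothetical toy tree: node -> (parent, character state).
TREE = {
    "root": (None, "B"),    # ancestor 3 in the figure, plesiomorphic state B
    "anc1": ("root", "C"),  # most recent common ancestor of clade 1
    "anc2": ("root", "C"),  # most recent common ancestor of clade 2
    "sp1a": ("anc1", "C"),
    "sp1b": ("anc1", "C"),
    "sp2a": ("anc2", "C"),
    "sp2b": ("anc2", "C"),
}


def path_to_root(node):
    """Return the node followed by each of its ancestors, ending at the root."""
    path = []
    while node is not None:
        path.append(node)
        node = TREE[node][0]
    return path


def mrca(a, b):
    """Most recent common ancestor of two nodes in TREE."""
    seen = set(path_to_root(a))
    for node in path_to_root(b):
        if node in seen:
            return node
    raise ValueError("nodes do not share an ancestor")


def classify(a, b):
    """Label a shared character state as homologous or homoplastic."""
    state_a, state_b = TREE[a][1], TREE[b][1]
    if state_a != state_b:
        return "different states"
    # Shared state counts as homologous only if the common ancestor had it too.
    return "homologous" if TREE[mrca(a, b)][1] == state_a else "homoplastic"


print(classify("sp1a", "sp1b"))  # homologous
print(classify("sp1a", "sp2a"))  # homoplastic
```

Within a clade the shared C traces back to an ancestor that already had C, while between clades the common ancestor had B, so the two Cs arose independently.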

One level of homology is that of genes. When genes are replicated, their daughters can undergo independent evolutionary change much like individual organisms can. Phylogenetic analysis is as possible on individual genes as it is on species. Because genes can replicate, either within the same genome (paralogy) or because of a speciation event (orthology), divergent genes can evolve independently but they are homologous due to their common ancestry. Homology doesn’t only occur at the level of genes. Over generations, phenotypes can change considerably. Morphological characters in different species are homologous if they arose from an ancestral state. They may be highly derived and superficially unrecognisable as homologues, they may even have novel functions, but the modern definition of homology is concerned with their relationship with one another rather than superficial or functional similarity. After agreeing on the concept of homology by common ancestry, it’s a relatively simple concept to understand when considering a single level, e.g. a morphological character or a gene. Homology is simply the continuation of characters. The complications arise when the genetic and morphological levels of homology are integrated. Developmental genetics involves understanding the relationship between morphological characters and their genetic basis.

The modern evolutionary synthesis reconciled genetics and the evolution of morphology (and other phenotypic traits such as physiology, behaviour, etc.) by natural selection. But before the influence of modern evo-devo, development was relatively poorly understood compared to traditional genetics and was seen as a black box that transforms the genetic information into three-dimensional, morphological structures. In the last two decades, evo-devo has replaced this black box development with an appreciation of the mechanisms responsible for generating morphological structures from genetic information. How genes are used in development is as important as what genes are available, and lineage-specific differences can come about due to changes in spatial or temporal expression of genes as well as by the evolution of the genes themselves. Development is complex, often involving many genes influencing the expression of each other, and highlights important information about homology. Developmental mechanisms may be conserved even if complete structures don’t form in some species (rudiments and vestiges) and can differ even for structures that are homologous. This suggests that there is a third level to consider, between genes and morphology (or other characters of the phenotype). Can entire gene regulatory networks be homologous? Does this have implications for the relationship between genes and morphology? How can we identify true homologues if there is a disassociation between the genotype and the phenotype? These are questions I find fascinating.

Disassociation between genotype and phenotype

Wagner argued that homology at the levels of genetics and morphology is similar, as morphological characters are equivalent to genetic loci. Just as there may be different alleles present for a gene in a population, there may be different states for a morphological character. A gene and a morphological character can be duplicated during a speciation event. The gene would be an orthologue. The morphological equivalent would be a bat’s wing and a cat’s anterior legs, which are homologous characters in related species. But duplications can also occur within a species. Gene duplication can create paralogous genes. These genes are certainly homologous and have a common ancestor, but both descendants occur in the same genome. The morphological equivalent would be when morphological characters become repeated, such as teeth or extra limbs.

It is reasonable to expect that the genetics of a morphological character can evolve and thus evolve the morphological character itself. Therefore, if a morphological character has evolved, it must be because the underlying genetics have evolved. When homology is applied to phenotypic characters (e.g. morphological structures, behaviours, modes of communication), those characters existed in the last common ancestor. So both levels can be thought of as equivalents of one another and both are relatively simple to appreciate conceptually. Indeed, it isn’t surprising that similar features persist over evolutionary time and in multiple species (homology), especially if the developmental basis of that feature has also been conserved. It also isn’t surprising that different selection pressures can bring about similar features in organisms that do not share a most recent common ancestor (homoplasy). The more surprising observation is that homologous features can be formed from non-homologous developmental processes, and homologous developmental processes can be found forming non-homologous features. It is the relationship between the two levels that complicates our understanding and makes this such a strange issue.

Thinking at two levels of homology (morphological characters and the genes involved in their development), it appears to be a paradox. It doesn’t make intuitive sense that homologous morphological characters are brought about by the expression of non-homologous genes. It is not difficult to imagine a situation where this paradox causes two biologists to disagree over the supposed homology of a morphological character. If one relied on comparing gene expression between species, and the other relied on bone structure or another morphological feature, the paradox could confuse matters. A careful approach considering multiple lines of evidence is clearly required, but which lines of evidence? Is it as simple as genes vs morphology? The relationship between genotype and phenotype is remarkably complex. Developmental processes can evolve independently yet result in the same phenotypic character. This disassociation between the genotype and phenotype has been referred to as “phenotypic drift” or “developmental system drift”. Such a disassociation through evolution can make the search for homologous characters difficult. It can be easy to mistake morphological characters as being homologous just because homologous genes are involved in their development. Inversely, truly homologous morphological characters may be overlooked if it is realised that their genetic or developmental bases are different. It is also important to remember that genes do not operate in isolation. Researchers must consider networks of genes and the role they play in the development of morphological structures.

Homologous genes and non-homologous phenotypic characters

There are many examples of homologous genes being used in the development of non-homologous phenotypic characters. Most developmental regulatory genes of metazoans are more ancient than their developmental roles are. Homeobox-containing genes predate the origin of metazoans yet are often involved in patterning phenotypic structures that are unique to metazoans. Clearly their roles in development have evolved over time with new roles being gained and old roles being lost in some lineages. The segmentation in Drosophila melanogaster, Schistocerca americana and Aphidius ervi is putatively homologous, yet there are genes essential for segmentation in the fruit fly that play an entirely different role in the locust and wasp. The genes fushi tarazu and even-skipped are pair-rule genes in the fly, which divide gene expression into half-segments of the embryo. In the locust and wasp, these genes are involved in the development of the central nervous system rather than body segmentation.

It is a recurring theme that homologous transcription factors can have different roles in different taxa. Orthologues of distal-less, engrailed, and orthodenticle in echinoderms pattern different morphological features than they do in arthropods and chordates. In arthropods and chordates, distal-less is expressed during limb outgrowth and plays a role in proximodistal patterning, engrailed is involved in neurogenesis in the central nervous system, and orthodenticle has a role in the specification of anterior structures. In most echinoderms, distal-less and orthodenticle are expressed in the podia and engrailed is involved in skeletogenesis. But evolution has altered the expression and roles of these genes even among echinoderms. In the Asteroidea (sea stars), distal-less is expressed in the larval brachiolar arms. In the Echinoidea (sea urchins), engrailed is involved in rudiment invagination. In the Holothuroidea (sea cucumbers), orthodenticle is expressed in the larval ciliated band. These changes in expression and role correlate with novel morphological features such as the brachiolar complex of sea star larvae or the sea urchin’s rudiment ectoderm invagination. Pre-existing genes have been co-opted for new roles in echinoderms.

Regulatory genes rarely have one role in a developing organism. The Notch signalling pathway is highly conserved and found in all metazoans. In Drosophila melanogaster, it is used in the development of wings, ommatidia, and bristles. These morphological structures are clearly not homologous, yet their development has common genetic features. Throughout the Metazoa, the Notch pathway can be found in the development of characters as diverse as feathers and T-lymphocytes. True conservation also occurs, such as the Hox genes and their role in patterning the anteroposterior axis in animals as different as fruit flies and humans. But these genes often have multiple roles. Although one role can be highly conserved, often there are divergent unique roles for these genes in different lineages.

Non-homologous genes and homologous phenotypic characters

Instead of homologous genes having roles in producing non-homologous morphologies, some homologous morphological characters are produced by non-homologous genes. Sex-lethal (Sxl) is a master regulatory gene that controls sex determination in Drosophila melanogaster. In other dipterans such as Ceratitis capitata and Musca domestica, Sxl exists but isn’t used in sex determination and is expressed during a different stage of development. Phylogenetic analysis suggests that the role in sex determination is the derived condition. Where even-skipped was co-opted to be used in the development of a novel morphological feature, Sxl has become involved in a developmental process that already existed. Sex determination in the Drosophila lineage existed before Sxl took on that role.

In most tetrapods, programmed cell death separates digit primordia during embryonic development. This creates interdigital space, allowing the primordia to develop into individual digits. In urodele amphibians, differential growth of the digits separates them, without apoptosis creating interdigital space. As a morphological feature, the digits of urodeles and other tetrapods are homologous. But the developmental processes and the genetics controlling those processes are not homologous. This phenomenon of homologous phenotypes being generated by non-homologous developmental processes is not restricted to adult morphology. In vertebrate embryos, the gastrula stage is considered to be homologous. However, it is found that very different developmental processes produce the gastrula in different vertebrate taxa.

Levels of homology

By revealing that development itself evolves, evo-devo implies that homology should be understood in a hierarchical fashion as there are several levels of homology. Homology at one level might not correspond to homology at other levels. As already discussed, two species may have homologous limbs, but the developmental processes that produce the limb, or the genetic cascades underlying those processes, may be different. For example, formation of the neural crest can occur by delamination or by cavitation, and gastrulation can occur via a blastodisc or a blastopore.

Some researchers have interpreted similar patterns of regulatory gene expression alone as evidence that morphological structures are homologous. This ignores the idea that homology may exist at several levels and it limits the evidence to a single source. Assuming that similar gene expression identifies homologous structures ignores the evolutionary histories of the structures and the regulatory genes. What exactly is homologous in a given example? The genes? Their expression patterns? Their developmental roles? The morphological structures that arise because of them? Because some of these levels can be homologous while others aren’t, mistakes can be made when expression data alone is used to assign homology to structures. At least three levels of homology and homoplasy must be considered: genes, developmental processes, and the resulting phenotypic character.

How can a morphological character (like segmentation) be homologous if different genes are involved? The answer lies in understanding developmental genetics and gene regulatory networks. Developmental processes can create different features in different organisms because they can be co-opted for new roles and old pathways can resurface or remain unexpressed, perhaps to be co-opted in the future. Wagner proposed that the homology of morphological characters is related to the continuity of gene regulatory networks (GRNs) rather than the expression of individual homologous genes. He refers to these networks as “character identity networks” (ChINs) and argues that they are what enables the execution of character-specific developmental programmes. In insect segmentation, more variation is seen in the homologous genes that are further upstream than downstream. Gap genes and pair-rule genes are higher in the segmentation hierarchy yet show more variation than lower genes such as the segment-polarity genes. Only the Diptera possess the gap gene bicoid and not even all members of the Diptera. Other segmented insects use different genes at this level of the segmentation hierarchy. But downstream GRNs are more conserved between taxa. Most if not all insects use engrailed and wingless as segment-polarity genes.

Generalising the insect segmentation data, Wagner argued that it is the most downstream regulatory networks, the ChINs, controlling the development of morphological characters that specify the identity of the character. If homologous morphological structures are controlled by homologous ChINs, this would explain the paradoxical relationship between morphology and genes. The use of different genes in developmental programmes for homologous morphological characters could be explained by homologous ChINs co-opting different individual genes (or pathways) independently. A kernel is a highly conserved GRN. The term ChIN is instead concerned with GRNs that execute a character-specific developmental programme. Some kernels will be ChINs, but not all, as the two terms were created for different reasons. One is concerned with conservation and age, the other with the relationship between the GRN and its ability to program character identity. Homologous ChINs can be very conserved, but can also co-opt different transcription factors in their regulation.

So basically…

The complex evolutionary relationship between genotype and phenotype provides two important messages. Firstly, as useful as gene expression data has been, it isn’t sufficient for diagnosing homologous morphological structures. Notch signalling doesn’t suggest that our T-cells and Drosophila eyes are homologous. Regulatory genes have multiple expression domains and play multiple roles in development. Secondly, it has been assumed that novel structures require novel genes or at least alleles. But how could new alleles or genes become established in a population before they produce an advantageous phenotype? Developmental genes and their ability to have multiple roles suggest an answer to this question. Genes can already exist in a population as new roles evolve and provide fitness advantages for individuals, and potentially the population, given time. Because developmental genes gain and lose roles, some morphological novelties presumably arise by co-opting pre-existing developmental genes for new roles. The echinoderm morphological novelties mentioned earlier provide a good example. At the same time, it’s important not to consider the disassociation between genotype and phenotype as a hindrance to investigation or as noise that stops us from identifying truly homologous characters. There is a lot to learn from studying homology. This phenomenon provides an opportunity to understand how morphological novelties come about and the role co-option plays.

Beyond any confusion caused by multiple levels of homology, there are other common issues in the literature that quite frankly get on my nerves. Gene nomenclature often makes it difficult to tell which genes are truly orthologous. Dlx-2 in Xenopus is not orthologous with Dlx-2 in zebrafish. This example refers to paralogous genes that duplicated before the divergence that led to Xenopus and zebrafish. Even more confusing is when paralogous genes evolve by duplication in independent lineages. It can be extremely difficult to tell which of the duplicates corresponds to the ancestral gene. The homologous gene may have been lost, leaving only the paralogues. Clearly, relying on just one line of evidence isn’t always sufficient for identifying homology. Another major problem is the notion of “functional homology”, which confuses similarity due to common ancestry with similarity due to functional convergence. The functions of homologous genes can diverge from their original functions, or converge on the functions of unrelated genes. Both of these possibilities could confuse a researcher relying only on gene expression patterns as evidence of homology. Clearly homologous structures and genes can have different functions, so similarity of function is not a valid criterion for identifying homology, yet “functional homology” is still occasionally used in the literature. The solution to these two problems is to constantly consider phylogenetics and evolutionary histories when comparing gene expression data. By reconstructing the gene family in all the species being compared, the timing of gene duplications can be calculated relative to the divergences of the species. This approach should improve the likelihood of identifying true orthologues so that only their gene expression patterns are compared.
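To make the gene-tree idea concrete, here is a minimal sketch of how orthologues and paralogues fall out of a reconstructed gene family. Everything in it is hypothetical: the gene names are placeholders, and the tree is a made-up case of a duplication that predates a frog/fish split, with each internal node already labelled as a duplication or a speciation event. Working out those labels from real sequences is the difficult bit and isn’t shown here.

```python
# Hypothetical gene tree: node -> (parent, event at that node).
# Tips (present-day genes) have no event.
GENE_TREE = {
    "ancestral_gene": (None, "duplication"),  # duplication before the species split
    "copy_A": ("ancestral_gene", "speciation"),
    "copy_B": ("ancestral_gene", "speciation"),
    "frog_A": ("copy_A", None),
    "fish_A": ("copy_A", None),
    "frog_B": ("copy_B", None),
    "fish_B": ("copy_B", None),
}


def lineage(gene):
    """Return the gene followed by each ancestral node back to the root."""
    path = []
    while gene is not None:
        path.append(gene)
        gene = GENE_TREE[gene][0]
    return path


def relationship(gene_a, gene_b):
    """Classify two genes by the event at their most recent common ancestor."""
    if gene_a == gene_b:
        raise ValueError("need two different genes")
    seen = set(lineage(gene_a))
    for node in lineage(gene_b):
        if node in seen:
            event = GENE_TREE[node][1]
            return "orthologues" if event == "speciation" else "paralogues"
    raise ValueError("genes are not in the same family tree")


print(relationship("frog_A", "fish_A"))  # orthologues
print(relationship("frog_A", "fish_B"))  # paralogues
print(relationship("frog_A", "frog_B"))  # paralogues
```

If a naming scheme happened to give frog_A and fish_B the same name, comparing their expression patterns would be comparing paralogues, which is exactly the trap described above.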

A third problem that is more difficult to solve (and happens to be one of my favourite biological topics) is the phenomenon of co-option. As discussed, this can lead to the recruitment of orthologous genes to be expressed in non-homologous structures during development. Arthropods, echinoderms, and chordates express distal-less in the distal region of their appendages during their outgrowth, but the structures themselves aren’t homologous. It has become important to distinguish the difference between homology of genes, developmental mechanisms, and morphological structures or other phenotypic characters. To use homology in comparative biology, researchers should observe that homology can exist at different levels and that true homology concerns the evolutionary histories of characters, rather than any general or functional similarity. This approach to homology should be used consistently in studies, whether studying gene expression, developmental mechanisms, or morphological structures. At least that’s what I think.

Not New Scientist

I like New Scientist headlines. I think it’s hard to take some science topics and make them catch the eye and make sense for all potential readers. It’s a tough job to do well. New Scientist have kept me in stitches with hilarious headlines over the years. Some I really dislike, such as the infamous “Darwin Was Wrong” headline. Others appear to be written by someone on acid. I’ve read about how flies unlock our understanding of slowing down or speeding up time itself, and the magazine has asked me to consider questions such as “does now exist”?

I mean, here’s an example:

[Image: a New Scientist headline]

It definitely works, because I stopped in my tracks while shopping and walked straight to the magazine when the headline caught my eye.

So, I’ve been getting better at coding thanks to various projects such as Deckard the Robot and my GenomeTweet accounts (HIV and E. coli are complete, yeast, fly and nematode are still running). I got bored last night so I decided to create an automated New Scientist headline generator on Twitter. This isn’t an attack on New Scientist. I certainly do have a problem with some of their sensationalism, but I love the wacky headlines that always make me smile. I don’t agree with many of their choices but I can’t fault their ability to choose eye-catching titles. Although the headline generator clearly has some creative input from me to make sure things run smoothly, the actual results for each tweet are a surprise and there are hundreds of thousands of combinations so I’m really enjoying seeing it run! Here are the first four tweets it created:

It’s only been running a few hours but already has 180+ followers at the time of writing (mostly scientists and science journalists). Clearly I’m not alone in enjoying New Scientist’s wacky headlines. Some thoughts on the new account:

Of course, it hasn’t escaped the notice of the lovely people at New Scientist. Fortunately they’ve taken it the right way.
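For the curious, a generator like this can be sketched in a few lines. This is not the bot’s actual script: the word lists and the four-slot template are invented here to show the general idea of picking one piece at random from each list and gluing them together.

```python
import random

# Made-up word lists; the real bot's vocabulary is not shown in this post.
OPENERS = ["Quantum", "Ancient", "Invisible", "Rogue"]
SUBJECTS = ["black holes", "fruit flies", "neutrinos", "octopuses"]
VERBS = ["rewrite", "unlock", "erase", "predict"]
OBJECTS = ["the nature of time", "your memories", "the origin of life", "reality itself"]

SLOTS = [OPENERS, SUBJECTS, VERBS, OBJECTS]


def make_headline(rng):
    """Pick one entry from each list and join them into a headline."""
    return " ".join(rng.choice(words) for words in SLOTS)


def count_combinations():
    """Number of distinct headlines the template can produce."""
    total = 1
    for words in SLOTS:
        total *= len(words)
    return total


rng = random.Random(42)  # seeded only so a run can be repeated
headline = make_headline(rng)
print(headline)
print(count_combinations())  # 4 * 4 * 4 * 4 = 256
```

The number of possible headlines is just the product of the list lengths, so it grows quickly: four lists of 25 entries each would already give 390,625 combinations.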

Teaching Deckard to avoid obstacles

In my quest to learn programming languages, I decided to build a robot. His name is Deckard. He’s mostly made of Lego with a Raspberry Pi for a brain. He’s going to be learning lots of skills and he’ll be quite social as he tweets about all his actions as well. He’ll understand spoken commands, he’ll be able to explore, function as an embarrassing alarm clock (he can take photos and has access to Twitter), reward work with biscuits, act as a security camera, tweet genomes, and be driven remotely from anywhere in the world. But it’s baby steps for now. He’s still a mess, very bulky and covered in cables. At the moment, I’m making sure that he’ll be able to avoid obstacles when exploring. Here’s a little video of him in action:

The next step is perfecting the voice recognition. You might have spotted a button on his left side near the back. When that button is pressed, he listens for 3 seconds for a command. I’ll upload a video of the voice commands when I’m happy with it. Once he’s a bit more complete you’ll find him tweeting at @DeckardRobot.
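The obstacle avoidance itself boils down to a simple decision rule, sketched below. This is an illustration rather than Deckard’s real code: it assumes a distance sensor that reports centimetres to the nearest obstacle, and the threshold and action names are invented. The hardware side (reading the sensor, driving the motors) is deliberately left out.

```python
SAFE_DISTANCE_CM = 30.0  # assumed threshold, chosen only for illustration


def choose_action(distance_cm, safe_distance_cm=SAFE_DISTANCE_CM):
    """Decide what to do next from a single distance reading."""
    if distance_cm < 0:
        raise ValueError("distance cannot be negative")
    if distance_cm < safe_distance_cm / 2:
        return "reverse"  # far too close, back away first
    if distance_cm < safe_distance_cm:
        return "turn"     # obstacle ahead, steer around it
    return "forward"      # path is clear


def explore(readings):
    """Map a series of sensor readings to a series of actions."""
    return [choose_action(d) for d in readings]


# Made-up readings standing in for a real sensor.
print(explore([120.0, 45.0, 25.0, 10.0, 60.0]))
# ['forward', 'forward', 'turn', 'reverse', 'forward']
```

Keeping the decision separate from the hardware means the logic can be tested at a desk with made-up readings before it ever drives a motor.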