There are many ways to characterize a plant, but perhaps everyone (including the first known botanist Theophrastus of Eresus, one of Plato’s students) starts with a simple distinction between woody and herbaceous plants.
For a recent research project on the evolution of this simplest plant trait, which stemmed from a NESCent working group, we needed an estimate of the proportion of woody species in the world. The problem seemed tractable; perhaps even something we could readily look up in Wikipedia. However, when we looked, we found nothing in everyone’s favourite internet look-up site, nor did we find anything in ISI, Google Scholar, or even more broadly in Google.
This absence of information led us to ponder how, in the era of “big data”, the answer to such a simple question could remain unknown. Splitting plants into herbaceous and woody dates back to Theophrastus (~300 BC), yet by all appearances this partitioning of biodiversity into different forms is a blind spot in the accumulated scientific knowledge. The diagram represents the problem we faced:
As shown in the diagram, while we think we live in the time of “big data”, with databases representing tens of thousands of species, our data are typically still a small fraction of global biodiversity. We should point out that for our NESCent project, we assembled the largest single plant trait database to date for one of the simplest possible traits, woodiness, and we still only reached values for 16% of known species (Zanne et al. 2014). GenBank, which is the most successful crowd-sourced open data effort in our field, is only doing slightly better. The take-home message, then, is that an understanding of global biodiversity that goes beyond simply a name is still absent.
Part of the problem is that today’s biologists, despite our access to modern gadgets, are still lagging behind Carl Linnaeus and those who continued after him. Global data require global standards, and in the Systema Naturae there was a clear filing cabinet in which to name and add newly encountered species. But for functional and genetic data we have not (until recently) had analogous filing cabinets for genes or functional traits. Thus, despite our access to better tools, those interested in either functional or molecular diversity are still playing a big game of catch-up.
All this is to say that there are a lot of missing data, and even if we wanted to rely upon the current data for woodiness to estimate the true values, we are faced with an additional problem: sampling bias. The species that we do have data for are biased geographically, functionally, and phylogenetically. Temperate climates and economically important species are both over-sampled. Oddities are often studied, while average species far from the nearest herbarium are rarely encountered by scientists or sampled for their genetic or functional traits.
These collection biases mean that it is not a simple task to estimate any summary statistics for global diversity based on the data currently at our fingertips, including the answer to the question we started with: how many species in the world are woody? Having faced these challenges in both data discovery and sampling bias, it finally dawned on us why the answer wasn’t already known: the question’s apparent simplicity was in fact misleading. As everyone learns in introductory biology, sampling bias is a serious problem.
All is not lost, however: we can once again turn to Linnaeus’ filing cabinet to provide us with a way forward. What we found when searching in his cabinets was that most genera are either all woody or all herbaceous; only a few have a mix of herbs and woody plants. Families are less bimodal, but many are still composed solely of one sort of plant or the other. This property of genera and families turns out to be quite a powerful observation, and with a new method we were able to leverage it to work our way toward what turned out to be a robust estimate of the percentage of species in the world that are woody. Our estimate is between 45 and 48% (FitzJohn et al. 2014).
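To make the idea concrete, here is a minimal sketch in Python of why grouping by genus helps. It is not the method of FitzJohn et al. (2014), which also has to deal with uncertainty and with genera that have no data at all; it only contrasts a naive pooled estimate with one that weights each genus by its total number of species. The genus names, the counts, and the function names are all hypothetical, invented purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Genus:
    name: str
    n_species: int     # total known species in the genus
    n_woody: int       # sampled species scored as woody
    n_herbaceous: int  # sampled species scored as herbaceous


def naive_fraction(genera):
    """Pool all sampled species and ignore which genus they came from."""
    woody = sum(g.n_woody for g in genera)
    sampled = sum(g.n_woody + g.n_herbaceous for g in genera)
    return woody / sampled


def genus_weighted_fraction(genera):
    """Apply each genus's observed woody fraction to all of its species."""
    estimated_woody = 0.0
    total_species = 0
    for g in genera:
        sampled = g.n_woody + g.n_herbaceous
        if sampled == 0:
            continue  # this sketch simply skips genera with no data
        estimated_woody += g.n_species * g.n_woody / sampled
        total_species += g.n_species
    return estimated_woody / total_species


# Made-up data: a well-studied woody genus, a large but poorly
# sampled herbaceous genus, and a rare mixed genus.
genera = [
    Genus("woody_genus_A", n_species=100, n_woody=40, n_herbaceous=0),
    Genus("herb_genus_B", n_species=900, n_woody=0, n_herbaceous=10),
    Genus("mixed_genus_C", n_species=200, n_woody=5, n_herbaceous=5),
]

print(f"naive estimate:          {naive_fraction(genera):.2f}")
print(f"genus-weighted estimate: {genus_weighted_fraction(genera):.2f}")
```

With these invented numbers the pooled estimate is dominated by the heavily sampled woody genus, while the genus-weighted estimate is pulled back toward the large herbaceous genus that was barely sampled. The weighting only works because a handful of records is usually enough to tell whether a genus is woody or herbaceous.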
Now that we have an answer to our initial question, many other questions can follow suit. For instance, as long as taxonomic membership conveys something about a species’ likely function, we can use similar approaches to ask the following. What proportion of plants are pollinated by animals? What proportion of mammals are herbivores? What percentage of insects fly? While these seem like basic, Wikipedia-style questions, their answers are currently unknown to both science and Google searches. Answers to these basic questions about the world will provide fundamental insight into how global functional diversity is partitioned across space and time. Furthermore, they are a first step towards understanding both macro-ecological and macro-evolutionary patterns: to understand the evolution of woody and herbaceous lifeforms in relation to climate (Zanne et al. 2014) on a global scale, we need to know the distribution of these traits across the tree of life.
These global questions about functional diversity are simple, but we are only now assembling the tools that let us answer them.
Will Cornwell & Rich FitzJohn
FitzJohn, R. G. et al. (2014) How much of the world is woody? Journal of Ecology. doi:10.1111/1365-2745.12260
Smith, S. A., Beaulieu, J. M., Stamatakis, A., & Donoghue, M. J. (2011) Understanding angiosperm diversification using small and large phylogenetic trees. American Journal of Botany. doi:10.3732/ajb.1000481
Zanne, A. E., Tank, D. C., Cornwell, W. K., Eastman, J. M., Smith, S. A., FitzJohn, R. G., … & Beaulieu, J. M. (2014) Three keys to the radiation of angiosperms into freezing environments. Nature. doi:10.1038/nature12872