## Macroevolutionary drivers of extinction and quantitative stratigraphy of graptolites

##### Abstract

The last few decades have seen rapid expansion of large-scale paleontological databases that have allowed a much more methodologically rigorous approach to macroevolutionary patterns over geologic timescales. Studies have included insights into extinction, speciation, geographic range, ecological diversification, and the effects of climate change among others. However, these databases are heavily biased towards benthic shelly invertebrates, no doubt a reflection of their proportional contribution to the fossil record as a whole. Because of this it has not been clear how general observed patterns are to other groups, particularly planktic organisms which may face very different evolutionary pressures. Here, I examined some common correlates of extinction risk identified previously, apply them to a group of planktic organisms, graptolites, and compare the significance and effect size to results from benthic taxa. Of the properties associated with extinction risk geographic range is the most widely used and consistent. This property can be measured in a variety of ways that may impact results but there is relatively little published literature comparing different methods of measuring geographic range despite its widespread use. I explore how six measures of geographic range respond to changes in sample size as well their utility as correlates of extinction risk in the context of three disparate datasets. Finally, graptolites are of practical use in biostratigraphy and form the bulk of the global geologic timescale for the Ordovician and Silurian periods. Automated techniques have been developed to incorporate ever larger biostratigraphic datasets but the uncertainty in results has been difficult to characterize. I use a graptolite occurrence dataset assembled from the literature to build a novel ordinal composite for the Middle to Late Ordovician with the recently developed Horizon Annealing (HA) technique. From this composite I explore the limits and advantages of HA, with a particular focus on characterizing the uncertainty within and across the solution space of the ordinal composite. Analyses of 1114 graptolite species with general linear models found that all factors commonly associated with extinction risk in benthic taxa were also significant in this planktic clade. However, the magnitude of the effect of geographic range was much lower than typically reported. This is most likely because the relationship between geographic range and extinction risk is non-linear with the greatest gains in extinction risk between taxa with small geographic range values and those with moderate ranges. As graptolites generally have large geographic ranges, even substantial increases in geographic range only provide a modest decrease in extinction risk. Different measures of geographic range also varied substantially in their explanatory power, and unidimensional measures were particularly poor. When the covariance among factors was accounted for by the use of partial least squares regression the strongest correlate of extinction risk was overall commonness, which reflects the interdependent effects of sampling and geographic range. Among analyses overall commonness explained 12-30% of the total variance in extinction risk. Because these two properties reinforce one another they are essentially impossible to disentangle in fossil datasets where the number of occurrences per taxon is typically low. After taking account of the correlation geographic range and sampling, the individual contributions of either variate alone were either very low (<5%) or not significant at all. The second strongest correlates of extinction risk were clade and age cohort, which are strongly correlated in graptolite evolutionary history. These two variates individually explained as much as an additional 18% of the variance. While all individual measures of geographic range were significant correlates of extinction risk in graptolites their individual magnitudes varied substantially. To explore the cause of these differences I examined how six measures of geographic range (minimum spanning tree distance, convex hull area, maximum pairwise great circle distance, latitudinal range, longitudinal range, and equal area cell count) responded to sample size using three datasets: a simulated dataset with two differently shaped distributions of equal area, a fossil dataset of 381 brachiopod genera from the Paleobiology Database, and a set of 1152 modern bird species from the eBird database. Results showed that measures could be classified into two groups based on how their accuracy and precision responded to sampling. Group 1 measures (maximum pairwise distance, latitudinal range, and longitudinal range) rapidly became accurate and precise as sample size increased while Group 2 measures (minimum spanning tree, convex hull, and cell count) became accurate at a much slower rate, with one notable exception. In one of the simulated distributions the convex hull, as expected, became more precise as sample size increased, but unexpectedly, became less accurate at the same time. Group 2 measures also often showed a humped pattern where the lowest precision occurs at low, but not the lowest, sample sizes. This feature reflects the fact that variance at very small sample sizes is limited by either the method (cell count) or clustering of occurrences (minimum spanning tree and convex hull). Whereas the rank order of the value taxon geographic ranges was robust to variation in sample size across all measures, Group 1 measures had lower values at the smallest sample sizes.I further examined the six measures as correlates of extinction risk in the PBDB dataset using both logistic regression and general linear models. Logistic regression of the duration of generic survival beyond their stage of first occurrence showed that all measures of geographic range were significant and relatively robust to differences in how large versus small ranges were classified. General linear models of extinction risk also found that all measures of geographic range were significant but that the magnitude of effects varied significantly. Group 1 measures were relatively poor correlates of extinction risk in comparison to Group 2 measures. The relatively poor performance of Group 1 measures probably reflects that fact that these measure are not able to capture the complex nature of real distributions and because of methodological biases in these measures that make them vulnerable to outlier points and make some values more likely than others. The convex hull also suffers from vulnerability to outlier points and consistently overestimates geographic range. The minimum spanning tree method performs well as a correlate of extinction risk but is computationally expensive while equal area cell count performs even better and is not computationally expensive. For these reasons I advise the use of equal area cell count as a measure of geographic range in most cases, with minimum spanning tree as a supplement if the number of point occurrences is not too great. I also advise against the use of Group 1 measures and CH as measures of geographic range for extinction risk analyses. Finally, although previous automated biostratigraphic methods of ordination have proven powerful, the amount of uncertainty in solutions remains unclear, partially because such characterization is time-consuming. This is especially true for the Horizon Annealing (HA) approach, which uses more data than other methods. I examined how HA performed using a large dataset of 109 stratigraphic sections containing 136 graptolites species, one event bed, and three K-bentonites across 1549 horizons. The resulting composite generally agrees with published biozonations and independently developed composites affirming HA scales up effectively. To test for a previously hypothesized methodological bias (greater influence of first appearances than last appearances on determining the ordination of events in the composite), I ran the solution in reverse but did not find any evidence of this bias in the system. To reduce the initial burn-in time of searching in HA, I tested the use of a quasi-Bayesian scaffolding approach that starts the solution closer to traditional biozonations and found that it did significantly improve the penalty score of the solution, without any apparent bias. I further examined how effective the different ways of mutating the composite solution were at improving the solution during the search procedure. Mutations that allow changes in the spacing of collection levels within sections relative to the composite were much more effective than those that did not. I characterized uncertainty in the solution with three methods (vice, jackknife, and an island search). Vice was the fastest and most conservative measure, and it indicates an uncertainty range of 40 horizons on average, which corresponds to a temporal resolution of ~373 Kya. Finally, I characterized the uncertainty based on differences between three independent runs and found that although the global first and last occurrences of taxa were robust, the position of individual horizons varied more than was implied by any of the within-run uncertainty metrics. This result suggests that using HA composites to test dynamics based on patterns of occurrence within the temporal range of individual taxa should be done with caution. Although the accelerating pace of quantitative approaches to macroevolution have yielded more robust results than was previously possible there are substantial gaps in taxonomic coverage and methodological issues which have often been downplayed. The studies presented here attempt to address some of those issues. First, a large proportion of macroevolutionary studies have been based on patterns in shelly benthic invertebrates and the results then treated as if they apply to all groups of organisms. I demonstrate that while the same factors are often significant between these shelly benthic invertebrates and planktic graptolites the magnitude of relationships vary substantially. These contrasts in magnitude are reflective of the groups’ ecological strategies and can provide further insight into macroevolutionary processes that uniquely effect each one. Therefore, studies should be careful to interpret their results with regard to the ecology of the particular organisms being examined and to not be overly broad in claims about how generalizable a pattern is. Second, the treatment of geographic range in macroevolutionary studies has been inconsistent and the field of macroevolution would be well-served by some standardization. Here I found that unidimensional measures of geographic range, particularly the commonly used maximum pairwise distance, should not be used due to serious biases and limitations. More complex methods, minimum spanning tree distance and equal area cell count, were found to have the most desirable qualities as measures of geographic range and should be used in most macroevolutionary analyses. Finally, the use of ordinal composite built from biostratigraphic data for analyses of turnover, extinction, colonization, or other patterns requires a knowledge of the uncertainty in the system that was lacking. I demonstrated three methods of quantifying uncertainty in an ordinal composite as well as refining search methods that should allow for this powerful approach to be more widely utilized and understood.