The Neotoma Paleoecology Database is a domain-specific data resource containing millions of fossil records from around the globe, covering the last 5.4 million years. The neotoma2 R package simplifies some of the data structures and concepts to facilitate statistical analysis and visualization. Users may wish to gain a deeper understanding of the resource itself, or build more complex data objects and relationships. For those users a partial list is provided here, including a table of code examples focusing on different geographic regions, languages and dataset types.
Data in Neotoma is associated with sites: specific locations with latitude and longitude coordinates. Within a site, there may be one or more collection units, the locations at which samples are physically collected. For example, an archaeological site may have one or more collection units (pits within a broader dig site), and a pollen sampling site on a lake may have multiple collection units (core sites within the lake basin). Collection units may have higher-resolution GPS locations, but are considered part of the broader site. Within a collection unit, data is collected at various analysis units, from which samples are obtained.
Because Neotoma is made up of a number of constituent databases (e.g., the Indo-Pacific Pollen Database, NANODe, FAUNMAP), the set of samples associated with a collection unit is assigned to a single dataset, which in turn is associated with a particular dataset type (e.g., pollen, diatom, vertebrate fauna) and constituent database.
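The nesting described above can be sketched with plain nested lists in base R. This is only an illustration of how the pieces relate: every field name and value below (siteid, handle, the coordinates, the taxa and counts) is a hypothetical placeholder, not the database schema or the package's own representation.

```r
# Illustrative only: a hand-built nested list mirroring
# site -> collection unit -> dataset -> samples.
# All names and values are hypothetical placeholders.
example_site <- list(
  siteid = 1L,
  sitename = "Example Lake",
  lat = 45.0,
  long = -90.0,
  collunits = list(
    list(
      handle = "CORE-A",
      datasets = list(
        list(
          datasettype = "pollen",
          database = "Example Constituent Database",
          # Each row stands for a sample taken from one analysis unit,
          # identified here by its depth in the core.
          samples = data.frame(
            depth = c(0, 10, 20),
            taxon = c("Pinus", "Quercus", "Pinus"),
            count = c(12, 7, 20),
            stringsAsFactors = FALSE
          )
        )
      )
    ),
    list(
      handle = "CORE-B",
      datasets = list(
        list(
          datasettype = "diatom",
          database = "Example Constituent Database",
          samples = data.frame(
            depth = c(5, 15),
            taxon = c("Aulacoseira", "Cyclotella"),
            count = c(30, 18),
            stringsAsFactors = FALSE
          )
        )
      )
    )
  )
)

# Walk the hierarchy: one row per dataset, labelled by its collection unit.
dataset_index <- do.call(rbind, lapply(example_site$collunits, function(cu) {
  data.frame(
    handle = cu$handle,
    datasettype = vapply(cu$datasets, function(ds) ds$datasettype, character(1)),
    n_samples = vapply(cu$datasets, function(ds) nrow(ds$samples), integer(1)),
    stringsAsFactors = FALSE
  )
}))

print(dataset_index)
```

The point of the sketch is that a dataset never floats free: it is always reached through a collection unit, which in turn belongs to exactly one site.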
Researchers often begin by searching for sites within a particular study area, whether defined by geographic or political boundaries. From there they interrogate the available datasets for their dataset type of interest. When they find records of interest, they then typically download the data and associated chronologies.
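As a rough sketch, that search, filter and download sequence might look like the function below. It assumes the neotoma2 package is installed and that the Neotoma API is reachable when the function is called. The function name fetch_pollen_samples and the example polygon are hypothetical; get_sites(), get_datasets(), filter(), get_downloads() and samples() are the package functions this sketch relies on.

```r
# A sketch of the search -> filter -> download workflow.
# Nothing is requested from Neotoma until the function is called.
fetch_pollen_samples <- function(study_area) {
  # 1. Find sites inside the study area (a GeoJSON polygon string here).
  area_sites <- neotoma2::get_sites(loc = study_area)

  # 2. Retrieve dataset metadata for those sites.
  area_datasets <- neotoma2::get_datasets(area_sites)

  # 3. Keep only the dataset type of interest.
  pollen_only <- neotoma2::filter(area_datasets, datasettype == "pollen")

  # 4. Download the full records for the remaining datasets.
  records <- neotoma2::get_downloads(pollen_only)

  # 5. Flatten the downloaded records into a table of samples.
  neotoma2::samples(records)
}

# Example call (hypothetical study area; requires network access):
# study_area <- '{"type": "Polygon", "coordinates":
#   [[[-93, 44], [-93, 46], [-90, 46], [-90, 44], [-93, 44]]]}'
# pollen_samples <- fetch_pollen_samples(study_area)
```

The dataset type is written as a literal inside filter() rather than passed in as an argument, because the filter expression is evaluated against the dataset metadata and a local variable may not be visible there.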
The neotoma2 R package is intended to act as the intermediary to support these research activities using the Neotoma Paleoecology Database. Because R is not a relational database, we needed to modify the data structures of the objects. To do this the package uses a set of S4 objects to represent different elements within the database.
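To make the idea concrete, here is a minimal sketch of how S4 classes can stand in for related database records. The class names (toy_site, toy_sites), their slots and the helper sites_to_df() are invented for illustration; they are not the class definitions that neotoma2 itself uses.

```r
library(methods)

# Hypothetical classes for illustration; not the classes neotoma2 defines.
setClass("toy_site",
         representation(siteid = "integer",
                        sitename = "character",
                        lat = "numeric",
                        long = "numeric"))

# A container class holding a list of toy_site objects.
# The validity check rejects anything that is not a toy_site.
setClass("toy_sites",
         representation(sites = "list"),
         validity = function(object) {
           ok <- vapply(object@sites, is, logical(1), class2 = "toy_site")
           if (all(ok)) TRUE else "every element must be a toy_site"
         })

# A length() method so the container behaves like a collection.
setMethod("length", "toy_sites", function(x) length(x@sites))

# Flatten the container into a data.frame, one row per site.
sites_to_df <- function(x) {
  data.frame(
    siteid = vapply(x@sites, function(s) s@siteid, integer(1)),
    sitename = vapply(x@sites, function(s) s@sitename, character(1)),
    lat = vapply(x@sites, function(s) s@lat, numeric(1)),
    long = vapply(x@sites, function(s) s@long, numeric(1)),
    stringsAsFactors = FALSE
  )
}

my_sites <- new("toy_sites", sites = list(
  new("toy_site", siteid = 1L, sitename = "Example Lake", lat = 45, long = -90),
  new("toy_site", siteid = 2L, sitename = "Example Bog", lat = 46.5, long = -91.2)
))

length(my_sites)
sites_to_df(my_sites)
```

Typed slots and a validity check give some of the guarantees a relational schema would otherwise provide, while a flattening helper recovers the tabular view that most analyses need.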
It is important to note, here and elsewhere: almost everything you will interact with is a sites object. A sites object is the general currency of this package. sites
may have more or less metadat