VertNet
About
Through a strategic partnership with data publishers worldwide, the VertNet project will provide a new platform that transforms the use of biodiversity data for scientific research, conservation, and education. Get involved by liking us on Facebook, following us on Twitter, or circling us on Google Plus.
Wednesday, December 6, 2017
Portal Usage Statistics Are (Almost) Back
Thanks to the financial support of the Museum of Vertebrate Zoology at Berkeley, we have fixed the issues that were preventing us from logging the VertNet statistics of data use. Usage statistics are being collected once again.
We are now working on the reporting and visualization of those stats, so that we can bring those back to the natural history collections community in a friendly, useful modality. We expect all of this to be up and running before the end of the year.
We apologize for any inconvenience that our data publishers may have experienced as a result of this outage.
Posted at 5:36 PM
Monday, December 12, 2016
Building a Zooarchaeology Network and the value of linked open data
Many years ago, when we created the first vertebrate data portal, we quickly became aware that there are many differences across museum collections.  Some subtle, some glaring.  Each discipline, not just vertebrates, utilizes its own social and technical practices to record and preserve data.  And even within very similar types of collections, institutions add their own particular spin to how they report and manage their data.
Zooarchaeological collections are a great example of a very different sort of collection from what everyone might consider to be typical museum collections, and they often contain a wide range of biological specimens, both vertebrate and non-vertebrate.  One major distinction between these specimens and those with which we usually work is that they were uncovered during the excavation of cultural heritage sites. As a result, they are associated with information not usually connected to other specimens, including content describing their location within sites (i.e., their provenience), human use, and other interesting information.
Despite these differences, or maybe because of them, we’ve created a new project called ZooArchNet.  ZooArchNet is a joint effort between the VertNet team and the University of Florida, in collaboration with Open Context.  The goal is to mobilize zooarchaeological specimen data in a way that is Darwin Core compliant, but still includes crucial cultural context information necessary for these data to be as useful as possible for interdisciplinary research.
We’re super excited about this project and with what we’ve been able to accomplish so far, especially the approach we’ve developed with our friends at Open Context to leverage a linked open data implementation.  We will populate the dwc:locationID field with the same site identifiers used by Open Context, so when you get zooarchaeological data from VertNet, you can easily discover (and view) the data describing site context by following the hyperlink to Open Context.
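To make the linking idea concrete, here is a minimal Python sketch of how someone might collect the site links from a batch of downloaded records. It assumes each record is a dictionary whose `locationID` key holds the shared site identifier as an http(s) URI; the key name, the function name `group_by_site`, and the example.org URIs are illustrative assumptions, not the actual VertNet download format.

```python
from collections import defaultdict
from urllib.parse import urlparse


def group_by_site(records, field="locationID"):
    """Group occurrence records by the site URI stored in their locationID.

    Records whose value is empty or does not look like an http(s) link
    are skipped, since there is nothing to follow for them.
    """
    sites = defaultdict(list)
    for record in records:
        value = (record.get(field) or "").strip()
        parsed = urlparse(value)
        if parsed.scheme in ("http", "https") and parsed.netloc:
            sites[value].append(record)
    return dict(sites)


# Hypothetical records; the URIs are placeholders, not real site pages.
records = [
    {"catalogNumber": "A1", "locationID": "https://example.org/site/1"},
    {"catalogNumber": "A2", "locationID": "https://example.org/site/1"},
    {"catalogNumber": "B7", "locationID": ""},
]
for site_uri, specimens in group_by_site(records).items():
    print(site_uri, len(specimens))
```

Each key in the result is a link a reader could open to view the site context, and the list under it holds the specimens recovered from that site.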
So far we have published just one dataset in this way, with more on the way (see the Parnell Site, Feature 1 Zooarchaeological Data dataset in the VertNet data portal).  Although we’re announcing our efforts to share zooarchaeological data now, VertNet has included these data for a long time.  Those datasets, however, lack the crucial contextual information that we’re trying to expose.
The creation of ZooArchNet is just beginning. If you are already making zooarchaeological specimens available via VertNet, and want to enhance their value further, or if you have data you’d like us to help you to publish, give us a holler!  We’d love to make those data available for research, education, and more!
Posted at 3:30 PM
Tuesday, October 11, 2016
New VertNet taxon-based snapshots are now available!
We’ve got snapshots from 2015 and 2016, plus a new 2016 archive of Traits data available via CyVerse (formerly iPlant).  Moving forward, we’ll be using CyVerse for our DOI/ARK assignments and permanent archiving in the CyVerse Data Commons.
CyVerse met all of our needs in terms of access and ease of use and more.  We know, too, that VertNet users have come to expect a certain level of ease and simplicity when they use our products and services, whether VertNet resources are on our portal or someone else’s, and CyVerse helps us to maintain this standard of service.  The Data Commons Repository (DCR) provides us with an easily accessible public repository designed to store and share large data sets, as well as simple protocols for uploading and updating our resources within their Discovery Environment (DE).  Requesting DOIs is both straightforward and fast, plus users working within the CyVerse DE have access to a wide range of tools for analysis, management, and publication.  At the end of the day, it was a no-brainer for us and we’re excited to work with the CyVerse team moving forward.
So, without any further ado, here are all of the VertNet snapshots at CyVerse (and KNB) for your research pleasure.  Let us know if you use these resources.  We’re always interested to know what you’re doing. Just drop those DOIs into a browser and they’ll take you to the resource directly. Let us know if this doesn’t work for you!  Of course, feel free to tell us if you love it, too.
We’ll keep a list of all of our snapshots and their DOIs on the VertNet Datasets, Tools & Code web page.
September 2016 - CyVerse
Amphibia - dx.doi.org/10.7946/P2F59W
Aves - dx.doi.org/10.7946/P2K01C
Fishes - dx.doi.org/10.7946/P2PP4B
Mammalia - dx.doi.org/10.7946/P2TG68
Reptilia - dx.doi.org/10.7946/P2Z59J
Traits - dx.doi.org/10.7946/P23011
October 2015 - CyVerse
Amphibia - dx.doi.org/10.7946/P2VC7X
Aves - dx.doi.org/10.7946/P2059V
Fishes - dx.doi.org/10.7946/P23W2C
Mammalia - dx.doi.org/10.7946/P27P49
Reptilia - dx.doi.org/10.7946/P2CC7K
April 2015 - KNB
Amphibia - dx.doi.org/10.5063/F1VX0DF9
Aves - dx.doi.org/10.5063/F1MG7MDB
Fishes - dx.doi.org/10.5063/F1R49NQB
Mammalia - dx.doi.org/10.5063/F1GQ6VPM
Reptilia - dx.doi.org/10.5063/F10P0WX6
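If you would rather build the links in a script than paste them into a browser, a small helper along these lines might do. This is only a sketch: the function name `doi_to_url` is made up for illustration, it uses the public https://doi.org/ resolver, and it tolerates the stray whitespace that can creep in when a DOI is copied from a web page.

```python
import re


def doi_to_url(text, resolver="https://doi.org/"):
    """Turn a DOI, with or without a doi.org prefix, into a resolver URL."""
    # Remove any whitespace picked up while copying the identifier.
    compact = re.sub(r"\s+", "", text)
    # Strip an optional scheme and (dx.)doi.org host so only the DOI remains.
    doi = re.sub(r"^(https?://)?(dx\.)?doi\.org/", "", compact, flags=re.IGNORECASE)
    # DOIs start with "10.", a numeric registrant code, then a slash and suffix.
    if not re.match(r"^10\.\d{4,9}/\S+$", doi):
        raise ValueError(f"not a recognizable DOI: {text!r}")
    return resolver + doi


print(doi_to_url("dx.doi.org/10.7946/P2F59W"))
print(doi_to_url("10.5063/F1VX0DF9"))
```

The same function accepts a bare DOI or one already prefixed with dx.doi.org, so a list like the one above can be passed through it unchanged.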
Posted at 11:58 AM
Monday, September 26, 2016
Sizing Up the Improved VertNet Portal
VertNet has released the single biggest change to the data portal since we launched VertNet in 2012. We want to tell you all about ALL of the new portal goodies, but first we want to highlight the newest portal feature: traits, specifically body length and body mass.
Yes, you can now use VertNet to discover length and weight for millions of specimens. Measurements such as these are often collected by researchers in the field, digitized, and eventually mobilized to VertNet, but these data are hard to find, non-standardized, and in places where a lot of people wouldn’t know to look. We’ve taken measures to change that.
No, we didn’t add any data to VertNet; we just liberated data that were buried deep in occurrence records. So, if you have ever wondered just how small is a pygmy shrew (c’mon, we know you have!), here are some answers: adults or subadults are between 64-106 mm in length and weigh between 1.8-7.35 grams.  Because these measurements are linked to individual specimens that have other data associated with them (e.g., their sex and location) we can also say that, no, pygmy shrews do not seem to be sexually dimorphic and their weight (at least) shows no discernible trend with latitude. That took all of 5 minutes to discover utilizing the VertNet portal to find weight and length measures for Sorex hoyi.
Now that you are hip to Sorex hoyi, you might be asking yourself, “Surely, we know the weights of vertebrates already.  Aren’t there databases with these data out there already?” Yes, there are, but we know of no other resource where you can find these data for the actual specimens and all of their metadata. Most existing resources report species averages or ranges, but now we can do so much more! Now we have the means to look at trends in space and time and get a full view of the distribution of traits per species, or even across an entire clade (e.g., all tigers).
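As a rough illustration of what specimen-level measurements make possible, the Python sketch below summarizes one trait per group (here, by sex). The record layout, the key names `sex` and `mass_g`, and the function name `summarize_trait` are assumptions for the example, and the values are made up purely to show the mechanics; they are not VertNet data.

```python
from statistics import mean


def summarize_trait(records, trait, group_by="sex"):
    """Return count, min, max, and mean of a trait for each group."""
    groups = {}
    for record in records:
        value = record.get(trait)
        if value is None:
            continue  # specimen has no measurement for this trait
        key = record.get(group_by) or "unknown"
        groups.setdefault(key, []).append(float(value))
    return {
        key: {"n": len(vals), "min": min(vals), "max": max(vals), "mean": mean(vals)}
        for key, vals in groups.items()
    }


# Made-up measurements, for illustration only.
specimens = [
    {"sex": "female", "mass_g": 3.0},
    {"sex": "female", "mass_g": 5.0},
    {"sex": "male", "mass_g": 4.0},
    {"sex": "male", "mass_g": None},
]
print(summarize_trait(specimens, "mass_g"))
```

Because each value stays attached to its specimen, the same records could just as easily be grouped by year or by a latitude band instead of by sex.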
The work to extract and re-assemble trait data from VertNet is its own saga, and took a ton of work by the whole VertNet team.  We want to tell you that story, but rather than tackle it here, we encourage you to look over the new VertNet Traits Guide in our Resources section of the portal. We’ve also written a paper that details the entire adventure.  It’s a real page-turner, but it’s in review at the journal Database right now. With luck, it should be available just in time for the holiday season and would make a great gift for the data junkie in your life.
Did we mention that the portal was updated with a bunch of other upgrades?  Stuff like a “Clear All” button in the Advanced search and an arrow that will take you back to the top of your results window regardless of how many hundreds of records you’ve viewed.  We’ve also more than doubled the number of Darwin Core fields you can search in specifically for content.  We’ve put up the entire list for you on our Portal Syntax page. Oh, and the Spatial Quality tab is taking a break so that we can give it a much needed update.
We’d very much like feedback on all of these changes, especially the new traits feature. We believe a feature like this opens up new horizons for biodiversity specimen-oriented data platforms. We also recognize that it changes the ways we think about data quality issues. So, whether you want to know about the smallest shrews, Rodents Of Unusual Size, or blue whales (well, ok, amphibians, reptiles, birds and fish too), we are open for business to help you get those answers.
Posted at 1:22 PM
Tuesday, February 2, 2016
It’s Happened!  And we’re still here!
Yes!  It’s official. Your VertNet team is now a part of the iDigBio Data Mobilization Team and we’re cranking away at data publishing all of the vertebrates AND plants AND molluscs AND fungi AND any biological collection that we can find.
iDigBio has given VertNet a home for the next two years, and while we’re there, they’ve asked us to continue our current services with an expanded scope: all biological collections.  That means that everything we’ve learned in the execution of our previous successes is available to you and your collections.
The completion of the goals from our latest NSF grant (DBI-1062193) and supplement included the creation of a cloud-based web-portal that brought the four vertebrate portals (MaNIS, FishNet, ORNIS, and HerpNET) into an integrated vertebrate-focussed portal, improved sustainability by reducing cost by an order of magnitude, expanded discoverability and reduced failure via a cloud-based architecture, pioneered the use of new data quality and publishing software, and doubled the number of collections eager to participate in the data-sharing community.
So, if you’ve been a part of the VertNet community over the years, all of these services and all of our expertise will continue for you as it has since 1999, only better.  The VertNet portal will continue, too, and we even expect to have some new and cool things to share in 2016.  Bottom line: just give us a call or send us an email and we’ll answer your questions, improve your data, and get your collections published to VertNet, iDigBio, GBIF, or any other portal.
If you’re not a part of a data publishing community yet, but want to move forward with publishing your data, give us a shout.  We’re happy to apply our years of experience to your cause, including data quality improvement, database organization, and data publication.
To find us, visit the VertNet Contact page or the iDigBio Directory and we’ll get back to you right away.
Posted at 1:24 PM
Tuesday, October 20, 2015
The future of VertNet: A new beginning
If VertNet were a member of the family Felidae, we’d be coming to the end of life #2.  Our first life began in 2008 when our ancestors, MaNIS, FishNet, HerpNET, and ORNIS, came together to find solutions to issues derived from aging technology and growing need.  At that time, the former National Biological Information Infrastructure (NBII), a unit within the USGS, provided seed money to get us through the early years.
In 2010 VertNet received a grant from the National Science Foundation (NSF, DBI-1062193), and thus, we were thrust into our second life, a life devoted to the service of the Classic Networks our ancestors created and the data publishers that supported them.   We worked long and hard to meet the challenges presented to us.  We made great progress in the development of the VertNet portal and a wide range of new tools and services for the biodiversity community, such as issue tracking, usage reporting, community data use norms, rVertNet, and licensing, not to mention the thousands of hours spent on improving data quality and fitness for use across the network.  Of greatest importance, we published nearly 220 million occurrence records in the last three years - with millions more on the way.
Now, in 2015, our NSF funding is coming to an end.  We’ve worked some magic to make three years’ worth of funding last for almost six with help from an NSF supplement and additional amounts from here and there. With our second life effectively at an end we are happy to say a third life awaits us in 2016!  In the coming months, our small team will join the iDigBio team, based at the University of Florida.  As a part of iDigBio we will continue our work to clean and publish as many biodiversity collections as we can into the iDigBio data portal at the same time that we continue to build upon our work on the VertNet and GBIF portals.
During this life, our focus will shift a little so that we can support the publication of data digitized by all of those participating in ADBC TCNs, but you can rest assured that our primary mission to publish the highest quality biodiversity data while providing the highest level of customer service will remain unchanged.  We’ll also continue our efforts to find a truly sustainable model for VertNet services.
For this third life, we are thankful to Larry Page and the iDigBio team for the opportunity to continue to serve the biodiversity community for another couple of years.  We will devote our efforts to iDigBio with the same zest for data we displayed in our second life and deliver to you, the users of biodiversity data, the best data about Muridae, Phrynosomatidae, Passeridae, Acrididae, and Lamiaceae any Felidae could ask for.
Posted at 2:00 PM
Monday, October 12, 2015
VertNet Portal: Now with new features and functions
Huzzah!!  We’ve added new features and functions to the VertNet portal.  Many thanks to all of you for sharing your wants and needs in the portal.  These changes come directly from your feedback.  Here are some of the most important ones.
First, you’ll notice that there is a new landing page for the portal.  No more extra clicks to start your search.  Speaking of searching… can’t remember the institution code for the Bell Museum or the University of Wyoming?  Now you don’t have to.  We’ve added a drop down menu in the Advanced Search options that will allow you to scroll through the list of publishers (and their institutionCodes) to pick the one you need.  We’ve also added a new link at the top of each search page that will take you directly to a list of publishers and their acronyms.
Want to search for records within specific months?  Well, now you can, using the Advanced search options.  If you’re looking to find all of the wild boars collected or observed during the months February through May, for example, just type Sus into the Genus field, scrofa into specificEpithet, and drop a 2 in the first “Months from” box, and 5 into the second box.
In fact, there are even more Darwin Core terms available now in the advanced search: collectionCode, recordedBy, preparations, sex, continent, locality, order, and family.  And for those of you who can’t resist stopping along the way to ask directions, we’ve added a few more blue information bubbles to help you scratch that itch.
If you’re a syntax nerd, you can still type all of your search terms and operators into the main search field.  The best part is that you can do that now for almost all Darwin Core terms.  Try a simple query such as preparations:skull or a more complex search, specificepithet:iguana genus:iguana year >= 1980 year <= 1990 month >= 2 month <= 5, for all the iguana skulls collected between 1980 and 1990 during the months of February-May.  When you get your search results, you’ll notice that you now can see recordedBy, sex, and the expanded eventDate in the quick view for each record.
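For anyone assembling these query strings in a script, the two patterns shown in this post (term:value pairs and >= / <= comparisons) are easy to generate. The Python sketch below is only an illustration: the function name `build_query` is made up, and it does not quote or escape values, so it assumes simple single-word values like the ones in the examples.

```python
def build_query(terms=None, ranges=None):
    """Assemble a portal-style query string.

    terms  -- mapping of Darwin Core term to an exact value (term:value)
    ranges -- mapping of term to an inclusive (low, high) pair
    """
    parts = [f"{name}:{value}" for name, value in (terms or {}).items()]
    for name, (low, high) in (ranges or {}).items():
        parts.append(f"{name} >= {low}")
        parts.append(f"{name} <= {high}")
    return " ".join(parts)


query = build_query(
    terms={"specificepithet": "iguana", "genus": "iguana"},
    ranges={"year": (1980, 1990), "month": (2, 5)},
)
print(query)
```

The printed string can be pasted into the main search field as-is.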
Finally, if you visit the Publishers page, you’ll discover that you can now sort by columns to help you find what you need more quickly.  Each publisher’s page (e.g., the MVZ) now features more information, including the Creative Commons designation for each data set, plus you can now search by collection instead of only by institution, making it even easier to focus your searches.
If there is more you’d like to see in our portal, please let us know.  All of your feedback helps us to build a better portal.
Posted at 12:15 PM
Wednesday, April 8, 2015
The (Data)ONE thing about VertNet and Big Data
[UPDATE: We’ve got new data snapshots up using CyVerse as our repository.  Read more at
.  You can find all VertNet snapshots and DOIs at
.]
17,237,897*. That’s the number of occurrence records VertNet has published to the VertNet data portal - records from 193 data resources, containing 254 collections, shared by 82 publishers globally (and growing). All of these records can be discovered and downloaded via our portal, with one exception: if you want to download ALL the records within a large subset of the complete VertNet index, such as all mammals or birds or reptiles (a common request), then, KABLOOEY! You’ll find yourself waiting for an email that may never arrive since our current architecture limits downloads to ~400,000 records.
So, what is the solution?  The solution is the KNB Data Repository and DataONE.  We should also note that this same solution provides a nice means to archive the entire VertNet datastore.
Here’s how it works.  First, VertNet assembles its datastore into BigQuery, a Google tool, with which we run queries over the complete VertNet index. From these queries we create “snapshots” by taxonomy.  So far, we have created snapshots for five groups (more may be forthcoming):  Amphibia, Aves, Fishes**, Mammalia, and Reptilia. These snapshots are converted into CSV (comma-separated) text files and compressed into .zip files.
Next, we generate a metadata*** file that describes the contents of each zipped dataset. Finally, we bundle up the metadata and the dataset and upload them to the KNB data repository and Voilà!, we now have a persistent and publicly accessible home for large datasets.  KNB is a member node of DataONE that provides a persistent home for the VertNet snapshots (among many other archives from projects around the world) and mints a DOI (digital object identifier), so that you can refer to the dataset by name or by DOI, just like you would a publication.
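The packaging step described above (rows to CSV, CSV to .zip, with a metadata file alongside) can be sketched in a few lines of Python using only the standard library. This is not VertNet's actual pipeline: the function name `package_snapshot`, the file names, the plain-text metadata, and the choice to place the metadata inside the same archive are all assumptions made for illustration.

```python
import csv
import io
import zipfile


def package_snapshot(rows, fieldnames, metadata_text, zip_path, name="snapshot"):
    """Write rows as CSV and bundle the CSV plus a metadata file into a zip."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(rows)
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        # The dataset itself, as comma-separated text.
        archive.writestr(f"{name}.csv", buffer.getvalue())
        # A description of the dataset (format assumed here to be plain text).
        archive.writestr(f"{name}_metadata.txt", metadata_text)
    return zip_path
```

A call such as `package_snapshot(rows, ["catalognumber", "genus"], "Demo snapshot", "aves.zip", name="aves")` would produce one archive that is ready to upload. For snapshots with millions of rows, streaming the CSV to disk rather than building it in memory would be the safer choice.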
We’ve just started this process, but we plan to generate snapshots of the VertNet index and to post them at KNB quarterly.  So, do you want to know what VertNet has to share about mammal or fish biocollections?  Jump in!  Just remember to cite these resources if you use them in your research.
Mammals:
Reptiles:
Amphibians:
Fishes:
Birds:
_______________________________
* Actually, we’ve published more than 166M records, available via data portals such as GBIF, iDigBio, and others.
** Yes, we understand that “fish” is a paraphyletic assemblage, but for the sake of simplicity and to avoid a ton of “fish-related resources”, we have bundled everything to “fish”.
*** The metadata we create is in the same format we use for any other resource in VertNet.
Posted at 4:07 PM
Tuesday, March 24, 2015
VertNet Springing Forward, Data and All
Many years ago, we introduced the VertNet Project to you with a riddle: What’s small enough to fit in your pocket and big enough to capture whales?  Well, in the four and a half years we’ve been working with you we’ve managed to capture 70,428 Cetacea. Not only have we captured many thousands of whales, we’ve also made data available about millions of their brother and sister animals with backbones (we’ve even published datasets for insects, plants, and other biodiversity for use in other portals - we just can’t pass up publishing non-vertebrate data, you know).
To be precise, as of March 12, 2015, we’ve published 166,254,927* occurrence records (species at a given place and time), and we’ve helped others to publish many millions more on their own.   That includes 17,237,897 predominantly biocollections records from 193 data resources (a resource is roughly equivalent to a data set); representing 254 collections; shared by 82 publishers - just in the VertNet portal. All 166.25M records are available via the GBIF data portal (32% of all GBIF records) while at least 9,182,050 records can be found in the iDigBio portal (34% of all iDigBio records**).
But we’ve been busy with more than data publishing. We’ve created a set of data quality services that make the data we publish more accurate, more complete, and more fit for use.  We’ve already published and incorporated into the VertNet portal more than 8.4M records from 83 resources that have been processed through our data quality “migrators”, a fancy term for the set of data checks we perform before we publish a dataset.  The process assesses, adds, and improves data content of a multitude of Darwin Core fields*** with the goal of making them as complete, correct, and discoverable as possible.
Oh, and did we mention that this service is still made available at no cost to the community?  It’s first come, first served, but it is well worth the effort.  We’ll discuss our migrators in greater detail in a future post.
____________________________
* The eBird dataset (containing citizen science observations) published on the behalf of Cornell contains >150 million records that are only available in the GBIF data portal.
** We publish more records than iDigBio ingests because they do not accept records from non-US entities, nor do they accept observations.
*** Migrators make sure that data are included for as many Darwin Core fields as can be provided, check for and report on suspicious content for many terms (catalogNumber, coordinatePrecision, coordinateUncertaintyInMeters, decimalLatitude, decimalLongitude, day, month, year, continent, country, stateProvince, county, municipality, island, islandGroup, waterBody, order, family, genus), and try to standardize the content of many fields (kingdom, phylum, class, order, family, genus, scientificName, continent, country, countryCode, stateProvince, county, municipality, island, islandGroup, waterBody, basisOfRecord, disposition, establishmentMeans, occurrenceStatus, geodeticDatum, georeferenceProtocol, georeferenceVerificationStatus, identificationQualifier, language, lifeStage, nomenclaturalCode, preparations, reproductiveCondition, sex, taxonRank, type, typeStatus, and verbatimCoordinateSystem) to facilitate searching while maintaining verbatim original data for classifications and geography.
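To give a flavor of what a check for "suspicious content" can look like, here is a small Python sketch that flags missing, non-numeric, or out-of-range values for four of the terms listed above. It is not the migrator code: the function name `flag_suspicious` and the message wording are invented for illustration, and the only rules applied are the bounds that follow from what the terms mean (latitude within -90 to 90, longitude within -180 to 180, month 1 to 12, day 1 to 31).

```python
# Allowed numeric range for each checked Darwin Core term.
RANGES = {
    "decimalLatitude": (-90, 90),
    "decimalLongitude": (-180, 180),
    "month": (1, 12),
    "day": (1, 31),
}


def _to_number(value):
    """Return value as a float, or None if it cannot be parsed."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return None


def flag_suspicious(record):
    """List human-readable issues found in one occurrence record."""
    issues = []
    for term, (low, high) in RANGES.items():
        raw = record.get(term)
        if raw is None or raw == "":
            issues.append(f"{term}: missing")
            continue
        number = _to_number(raw)
        if number is None:
            issues.append(f"{term}: not numeric ({raw!r})")
        elif not low <= number <= high:
            issues.append(f"{term}: out of range ({raw!r})")
    return issues


print(flag_suspicious({"decimalLatitude": "95.2", "decimalLongitude": "-122.3",
                       "month": "13", "day": ""}))
```

A real quality pass would go well beyond range checks (for example, comparing coordinates against the stated country), but the report-and-keep-the-original pattern is the same.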
Posted at 1:29 PM
Thursday, January 22, 2015
Rev Up Your Research with the All New rvertnet
After months up on cinder blocks in the VertNet garage, rvertnet has been completely restored and is ready for a test drive.  Strap on a helmet, kick the tires, and put this newly tricked-out R package through its paces.  We’ve retooled rvertnet to take advantage of the processing power of the new VertNet API.
Take a look under the hood and you’ll find the functions that drivers of any stripe need to take their research further, faster:
Daily Commuters can use vertsearch() to perform a good old global search of VertNet to return records (max. 1000 but see bigsearch() below) matching your query and ready for use in R;
Precision Enthusiasts will enjoy searchbyterm() for full control, using keywords (Darwin Core terms) to specify exactly how your query should be interpreted;
Adventure Drivers should pack the spatialsearch() when they hit the road to find records anywhere on the map: just give rvertnet a latitude, a longitude, a radius, and it will find every mappable record in VertNet within the defined area;
Truckers can pick up a heavy load using bigsearch(), which searches the VertNet warehouse for every record matching your query and delivers the whole load by email;
Sunday Drivers can use vertmap() to cruise the world (and zoom in on their favorite places), populated with friendly local data from VertNet;
Crash-test Dummies can get all the stats on their high-impact research using vertsummary() to summarize the results of any query.
rvertnet search functions will return results in a full or compact data frame (always full for bigsearch()), sorted by Darwin Core term, and ready for your analysis.
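The idea behind a latitude, longitude, and radius search can be illustrated with a great-circle distance filter. The Python sketch below is not how rvertnet or the VertNet API implements spatialsearch(); it only shows the geometry on records already in hand. The function names, the lowercase `decimallatitude` / `decimallongitude` keys, and the use of kilometers for the radius are assumptions for the example.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometers."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    # Clamp to guard against tiny floating-point overshoot above 1.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def within_radius(records, lat, lon, radius_km):
    """Keep records whose coordinates fall within radius_km of (lat, lon)."""
    hits = []
    for record in records:
        try:
            rec_lat = float(record["decimallatitude"])
            rec_lon = float(record["decimallongitude"])
        except (KeyError, TypeError, ValueError):
            continue  # no usable coordinates, so the record is not mappable
        if haversine_km(lat, lon, rec_lat, rec_lon) <= radius_km:
            hits.append(record)
    return hits
```

Records without usable coordinates are skipped, which mirrors the "every mappable record" wording above.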
Ready to take a spin?  Test drive VertNet in a whole new ride at CRAN:
You can get the package and check out our documentation at GitHub:
UPDATE: Links and documentation are also available on the VertNet Tools page:
Let us know how far you go.  Driver feedback is encouraged.
VertNet wishes to acknowledge mechanic Chris Ray (Univ. of Colorado) and paint and tune-up specialist Scott Chamberlain (rOpenSci): Chris for the heavy frame mending and for rebuilding the engine, and Scott for the detail work and fine tuning.
Posted at 1:13 PM