A biodiversity dataset graph: https://jhpoelen.nl/rats. 2021. hash://sha256/812da92d28f6abbd8b26be507168877ede7dfd78f7cc5b79b417316cf64ff78c

Biodiversity knowledge graph created using Preston v0.3.1 and Jekyll v3.9.0 on 2021-09-23T00:00:59.349Z

Welcome!

Are you looking for a way to have fast, local access to GBIF/iDigBio indexed records and media?

Would you like to have an exact copy of the images in your research dataset?

Do you want to include the latest research data while keeping your original data around?

Introducing Content-based Biodiversity Data Archives.

This automatically generated website contains a versioned archive of a custom selection of occurrence/specimen records and associated media. The selection is made using currently available biodiversity search indexes such as the iDigBio Search API or the GBIF Occurrence Search API. The indexed data is archived using Preston, a biodiversity data tracker that can version entire biodiversity dataset networks. Finally, the website is generated from the archived content using Jekyll, the static site generator that powers GitHub Pages.

Archive Indexed Content

This biodiversity data archive website was created with the following steps:

# first, create a new blank jekyll site (needs Jekyll v4+, tested on Jekyll v4.0.1)
jekyll new [site_dir] --blank 

cd [site_dir]

# use the GBIF/iDigBio API to track specimens, occurrences, and their related images
preston track "https://search.idigbio.org/v2/search/records/?rq=%7B%22genus%22%3A%22Rattus%22%2c%22hasImage%22%3A%22true%22%7D&limit=100&offset=0" 

# generate Jekyll site for archived content
preston copyTo --type jekyll . 

# launch website and visit http://localhost:4000 in your browser
jekyll s 
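
The tracked URL above requests a single page of results (limit=100, offset=0). As a rough sketch, further pages could be requested by stepping the offset parameter. Whether Preston follows result pages on its own is not stated here, so treat this as an assumption; the function name idigbio_page_urls is hypothetical and not part of Preston or iDigBio.

```shell
#!/bin/sh
# Hypothetical helper: print one iDigBio search URL per result page.
#   $1 = URL-encoded rq parameter (as used in the preston track command above)
#   $2 = page size (limit)
#   $3 = number of pages to generate
idigbio_page_urls() {
  rq="$1"
  limit="$2"
  pages="$3"
  i=0
  while [ "$i" -lt "$pages" ]; do
    # rq is passed as a printf argument, so its % escapes are printed verbatim
    printf 'https://search.idigbio.org/v2/search/records/?rq=%s&limit=%s&offset=%s\n' \
      "$rq" "$limit" "$((i * limit))"
    i=$((i + 1))
  done
}

# Possible usage (assumes tracking each page separately is desired):
#   idigbio_page_urls '%7B%22genus%22%3A%22Rattus%22%2c%22hasImage%22%3A%22true%22%7D' 100 3 |
#     while read -r url; do preston track "$url"; done
```

With a page count of 1 and a limit of 100, the helper prints exactly the URL used in the steps above.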

Clone data

You can clone an exact copy of the entire biodiversity data archive using:

# first, create a new blank jekyll site (needs Jekyll v4+, tested on Jekyll v4.0.1)
jekyll new [site_dir] --blank 

cd [site_dir]

# clone this existing biodiversity dataset graph 
preston clone "https://jhpoelen.nl/rats/data"

# generate Jekyll site for archived content
preston copyTo --type jekyll . 

# launch website and visit http://localhost:4000 in your browser
jekyll s 

Programmatic access

Access to Indexed Records

You can also query the indexed data available through this site via the API at https://jhpoelen.nl/rats/data.json (one JSON object per line, see https://jsonlines.org/).

With this, you can programmatically access the data and select the records you are interested in.

For instance, using curl and jq, you can show the first record by executing:

$ curl "https://jhpoelen.nl/rats/data.json" | jq -c 'select(.type == "records")' | head -n1 
[{"uuid":"012f4a15-41a4-4d41-b10d-455a559b6003","type":"records","etag":"d3a85eeffa9c798e581de1910d3bfb507776434d","data":{"dwc:startDayOfYear":"136","dwc:specificEpithet":"sp.","dwc:kingdom":"Animalia","dwc:recordedBy":"R. Crombie","dwc:order":"Rodentia","dwc:habitat":"Ecological remarks by collector(s): yes","dwc:individualCount":"1","dwc:occurrenceID":"http://n2t.net/ark:/65665/3f06530c3-541c-41df-976d-50bcf5008193","dwc:lifeStage":"Juvenile","dwc:associatedMedia":"13941065","id":"http://n2t.net/ark:/65665/3f06530c3-541c-41df-976d-50bcf5008193","dwc:country":"Belau","dwc:collectionCode":"Mammals","dwc:higherClassification":"Animalia, Chordata, Vertebrata, Mammalia, Eutheria, Rodentia, Myomorpha, Muridae, Murinae","dwc:waterBody":"North Pacific Ocean","dwc:basisOfRecord":"PreservedSpecimen","dwc:genus":"Rattus","dwc:family":"Muridae","dwc:sex":"Unknown","dwc:phylum":"Chordata","dwc:locality":"Koror, Grounds Of Palau National Museum","dwc:institutionID":"http://biocol.org/urn:lsid:biocol.org:col:34871","dwc:island":"Koror Island","dwc:institutionCode":"USNM","dwc:class":"Mammalia","dwc:catalogNumber":"577588","dcterms:type":"PhysicalObject","dwc:higherGeography":"North Pacific Ocean, Belau, Koror Island","dwc:endDayOfYear":"136","dwc:datasetName":"NMNH Extant Biology","dwc:month":"5","dwc:verbatimEventDate":"15 May 1992","dwc:recordNumber":"220132","dwc:preparations":"Fluid","dwc:scientificName":"Rattus sp.","dwc:day":"15","dwc:year":"1992"},"indexTerms":{"individualcount":1,"family":"muridae","recordset":"a6eee223-cf3b-4079-8bb2-b77dad8cae9d","dqs":0.34782608695652173,"phylum":"chordata","waterbody":"north pacific ocean","catalognumber":"577588","startdayofyear":136,"specificepithet":"sp.","uuid":"012f4a15-41a4-4d41-b10d-455a559b6003","basisofrecord":"preservedspecimen","collector":"r. crombie","institutioncode":"usnm","mediarecords":["035c8afc-6ec7-46d0-b2cc-9593fa8c2fd0"],"datemodified":"2021-04-18T11:29:37.741747+00:00","datecollected":"1992-05-15","etag":"d3a85eeffa9c798e581de1910d3bfb507776434d","recordnumber":"220132","hasImage":true,"kingdom":"animalia","highertaxon":"animalia, chordata, vertebrata, mammalia, eutheria, rodentia, myomorpha, muridae, murinae","scientificname":"rattus sp.","indexData":{"dwc:startDayOfYear":"136","dwc:specificEpithet":"sp.","dwc:kingdom":"Animalia","dwc:recordedBy":"R. Crombie","idigbio:uuid":"012f4a15-41a4-4d41-b10d-455a559b6003","dwc:order":"Rodentia","dwc:habitat":"Ecological remarks by collector(s): yes","dwc:individualCount":"1","idigbio:recordIds":["a6eee223-cf3b-4079-8bb2-b77dad8cae9d\\http://n2t.net/ark:/65665/3f06530c3-541c-41df-976d-50bcf5008193"],"dwc:occurrenceID":"http://n2t.net/ark:/65665/3f06530c3-541c-41df-976d-50bcf5008193","dwc:lifeStage":"Juvenile","dwc:island":"Koror Island","id":"http://n2t.net/ark:/65665/3f06530c3-541c-41df-976d-50bcf5008193","idigbio:parent":"a6eee223-cf3b-4079-8bb2-b77dad8cae9d","dwc:country":"Belau","idigbio:etag":"d3a85eeffa9c798e581de1910d3bfb507776434d","dwc:collectionCode":"Mammals","dwc:class":"Mammalia","dwc:waterBody":"North Pacific Ocean","dwc:basisOfRecord":"PreservedSpecimen","dwc:genus":"Rattus","dwc:family":"Muridae","dwc:sex":"Unknown","idigbio:siblings":{"mediarecord":["035c8afc-6ec7-46d0-b2cc-9593fa8c2fd0"]},"dwc:phylum":"Chordata","idigbio:dateModified":"2021-04-18T11:29:37.741747","dwc:locality":"Koror, Grounds Of Palau National Museum","dwc:institutionID":"http://biocol.org/urn:lsid:biocol.org:col:34871","dwc:associatedMedia":"13941065","dwc:institutionCode":"USNM","dwc:higherClassification":"Animalia, Chordata, Vertebrata, Mammalia, Eutheria, Rodentia, Myomorpha, Muridae, Murinae","dwc:catalogNumber":"577588","dcterms:type":"PhysicalObject","dwc:higherGeography":"North Pacific Ocean, Belau, Koror Island","dwc:endDayOfYear":"136","dwc:datasetName":"NMNH Extant Biology","dwc:month":"5","dwc:verbatimEventDate":"15 May 1992","dwc:recordNumber":"220132","dwc:preparations":"Fluid","flag_taxon_match_failed":true,"dwc:scientificName":"Rattus sp.","dwc:day":"15","dwc:year":"1992"},"hasMedia":true,"class":"mammalia","occurrenceid":"http://n2t.net/ark:/65665/3f06530c3-541c-41df-976d-50bcf5008193","institutionid":"http://biocol.org/urn:lsid:biocol.org:col:34871","country":"belau","locality":"koror, grounds of palau national museum","collectioncode":"mammals","flags":["taxon_match_failed"],"verbatimeventdate":"15 may 1992","recordids":["a6eee223-cf3b-4079-8bb2-b77dad8cae9d\\http://n2t.net/ark:/65665/3f06530c3-541c-41df-976d-50bcf5008193"],"genus":"rattus","order":"rodentia"}}]

Or, look for an occurrence with images indexed by GBIF:

$ curl "https://jhpoelen.nl/rats/data.json" | jq -c 'select(.media[]?.type == "StillImage")' | head -n1
[] 

Or, select archived iDigBio indexed records with scientific name matching Liphanthus sabulosus:

curl -s "https://jhpoelen.nl/rats/data.json" | jq -c 'select(.data["dwc:scientificName"] == "Liphanthus sabulosus")' 
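
The same kind of filter can also reduce each record to a few columns. The sketch below assumes that every line of data.json is a single JSON object with top-level uuid, type and data fields, as the select filters above imply; the sample output shown earlier is wrapped in brackets, so the actual shape may differ. The function name records_to_tsv is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read JSON lines on stdin and print one tab-separated
# row (uuid, scientific name, country) per entry of type "records".
# Fields that are absent are printed as empty columns.
records_to_tsv() {
  jq -r 'select(.type == "records")
         | [.uuid, .data["dwc:scientificName"], .data["dwc:country"]]
         | @tsv'
}

# Possible usage:
#   curl -s "https://jhpoelen.nl/rats/data.json" | records_to_tsv | head
```

Tab-separated output is easy to pass on to tools such as sort, cut or a spreadsheet, which is the reason for choosing @tsv over JSON here.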

Access to Archive Content

This biodiversity archive contains local copies of remote content. You can access a list of all archived locations via the content registry at https://jhpoelen.nl/rats/registry.json (one JSON object per line, see https://jsonlines.org/). Here’s a way to get the first one:

$ curl -s "https://jhpoelen.nl/rats/registry.json" | head -n1
{"url":"https://search.idigbio.org/v2/search/records/?rq=%7B%22genus%22%3A%22Rattus%22%2c%22hasImage%22%3A%22true%22%7D&limit=100&offset=0","verb":"http://purl.org/pav/hasVersion","hash":"hash://sha256/22514c875aa0832d3144c78ee5fc27af3029ecbefcaf9f5df0e25b3a79845cdd","graphname":"urn:uuid:d39cedb4-8ecc-4f2e-a9ba-560a0547f382"}
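
Each registry entry pairs a URL with a content hash, which makes it possible to check that a local copy is exact. The following is a minimal sketch, assuming the hash://sha256/ identifier is the SHA-256 digest of the raw bytes of the archived content and that sha256sum (GNU coreutils) is installed; on macOS, shasum -a 256 would be the likely substitute. The function name verify_sha256 is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit status 0) only if the SHA-256 digest of
# the file in $1 matches the hash://sha256/... identifier in $2.
verify_sha256() {
  file="$1"
  expected="${2#hash://sha256/}"   # strip the scheme prefix, keep the hex digest
  actual="$(sha256sum "$file" | cut -d' ' -f1)"
  [ "$actual" = "$expected" ]
}

# Possible usage, with local_copy being a previously downloaded file:
#   verify_sha256 local_copy \
#     "hash://sha256/22514c875aa0832d3144c78ee5fc27af3029ecbefcaf9f5df0e25b3a79845cdd" \
#     && echo "content matches" || echo "content differs"
```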

More Access Methods

The backbone of this biodiversity data archive is its provenance log, or knowledge graph. This knowledge graph is stored in RDF/N-Quads format and can be loaded into a triple store and queried using SPARQL.

Because the log is stored in a text file, you can easily read it. For instance, the first 10 lines of the provenance graph (or knowledge graph) can be seen when running:

curl -s "https://jhpoelen.nl/rats/data/81/2d/812da92d28f6abbd8b26be507168877ede7dfd78f7cc5b79b417316cf64ff78c" | head 
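
The URL above places the content under two directory levels taken from the first four hex characters of its hash (81, then 2d, then the full digest). Assuming this layout holds for all archived content, which is inferred from this one URL rather than confirmed here, a content URL can be derived from any hash://sha256/ identifier. The function name content_url is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: build the URL of archived content from its hash.
#   $1 = base URL of the data directory, e.g. https://jhpoelen.nl/rats/data
#   $2 = content identifier, e.g. hash://sha256/812da9...
# Assumed layout: <base>/<hex chars 1-2>/<hex chars 3-4>/<full hex digest>
content_url() {
  base_url="$1"
  hex="${2#hash://sha256/}"
  first="$(printf '%s' "$hex" | cut -c1-2)"
  second="$(printf '%s' "$hex" | cut -c3-4)"
  printf '%s/%s/%s/%s\n' "$base_url" "$first" "$second" "$hex"
}

# Possible usage: fetch the content behind a hash taken from the registry
#   curl -s "$(content_url "https://jhpoelen.nl/rats/data" "hash://sha256/...")" | head
```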

For more information about how to take full advantage of this biodiversity data graph, please review the Preston documentation for use cases, concepts and architecture.

Updating

Many natural history collections are actively digitizing their holdings, and iDigBio indexes new data records and media as they become available. This website can be updated to incorporate newly added or updated data by:

cd [site_dir]

# archive records and related images with criteria specified in some biodiversity search API
preston update "https://search.idigbio.org/v2/search/records/?rq=%7B%22genus%22%3A%22Rattus%22%2c%22hasImage%22%3A%22true%22%7D&limit=100&offset=0"

# update Jekyll site with archived content
preston copyTo --type jekyll .

What is in this archive?

This archived dataset includes 100 iDigBio indexed specimen records, 114 iDigBio indexed media records, 0 GBIF indexed occurrence records, and 0 GBIF indexed images.

The first 10-20 records and their associated media included in this data archive are:

Specimen Record
urn:uuid:012f4a15-41a4-4d41-b10d-455a559b6003@jhpoelen.nl
urn:uuid:012f4a15-41a4-4d41-b10d-455a559b6003@idigbio.org
Media Record
Specimen Record
urn:uuid:048d0252-5517-4afc-9ce7-70949cdc8848@jhpoelen.nl
urn:uuid:048d0252-5517-4afc-9ce7-70949cdc8848@idigbio.org
Media Record
Specimen Record
urn:uuid:055daf04-0c5d-4d23-b25f-fabf27893dae@jhpoelen.nl
urn:uuid:055daf04-0c5d-4d23-b25f-fabf27893dae@idigbio.org
Media Record
Specimen Record
urn:uuid:05f55d82-8b8c-4ab5-a8c8-ae94c82f203c@jhpoelen.nl
urn:uuid:05f55d82-8b8c-4ab5-a8c8-ae94c82f203c@idigbio.org
Media Record
Specimen Record
urn:uuid:0acef499-d5e3-4445-a687-b22c62434ecb@jhpoelen.nl
urn:uuid:0acef499-d5e3-4445-a687-b22c62434ecb@idigbio.org
Media Record
Specimen Record
urn:uuid:0d29fee5-68fb-476b-870e-ec615fa05e95@jhpoelen.nl
urn:uuid:0d29fee5-68fb-476b-870e-ec615fa05e95@idigbio.org
Media Record
Media Record
Specimen Record
urn:uuid:1dee39de-ef66-4db8-aa7e-fe146fcd7478@jhpoelen.nl
urn:uuid:1dee39de-ef66-4db8-aa7e-fe146fcd7478@idigbio.org
Media Record
Specimen Record
urn:uuid:24ba25d8-293b-4573-99bf-1a876831480d@jhpoelen.nl
urn:uuid:24ba25d8-293b-4573-99bf-1a876831480d@idigbio.org
Media Record
Specimen Record
urn:uuid:2ad99708-053c-4534-9c63-f8cfb153e641@jhpoelen.nl
urn:uuid:2ad99708-053c-4534-9c63-f8cfb153e641@idigbio.org
Media Record
Specimen Record
urn:uuid:30fa17a4-9d57-4310-8425-5ee4151442d7@jhpoelen.nl
urn:uuid:30fa17a4-9d57-4310-8425-5ee4151442d7@idigbio.org
Media Record