- Data use constraints
- Data interchange structure and contents
- Software
- Digital image use
- Sensitive data
- Geographic tools
- GBIF guides
|
|
|
Data Use Constraints |
|
When providing information over the Internet, it is advisable to state, for the benefit of potential users, the conditions under which the data are provided. Intellectual property issues cover not only economic and benefit-return matters but also attribution and credit for the work done, and incorrect use of the data. One of the basic principles of GBIF is the explicit acknowledgement of the original providers of the information. Further information on this issue is given in the sections below.
There are some other examples like the Cryptogamy Herbarium of the Royal Botanic Garden in Madrid, the REMIB (CONABIO in Mexico) and the Species Analyst Network (Univ. of Kansas - Natural History Museum and Biodiversity Research Centre).
|
| |
|
| |
|
|
Agreements within the Framework of GBIF |
|
All the original agreements on data sharing and use are available on the GBIF Secretariat web portal at the following address:
http://www.gbif.org/DataProviders/Agreements/
|
|
|
|
GBIF Data Use Agreement |
|
All potential users of GBIF data must read and understand the GBIF Data Use Agreement before being allowed to access data. The following document includes both the Spanish and English versions of this agreement:
GBIF Data Use Agreement - Spanish & English |
|
|
|
GBIF Data Sharing Agreement |
|
Before sharing any data through the GBIF network, potential data providers must read and agree to the GBIF Data Sharing Agreement. The following document includes both the Spanish and English versions of this agreement:
GBIF Data Sharing Agreement - Spanish & English |
|
|
|
|
|
Intellectual Property Rights |
|
Guiding Principles Regarding Intellectual Property Rights |
|
The discussion of intellectual property rights (IPR) arose naturally within GBIF, which was conceived as a free-access facility from the very beginning. In March 2004 a meeting on this issue was held in Madrid, the 'GBIF Experts' meeting on biodiversity data, databases and property rights issue', whose conclusions can be reviewed in the CIRCA system. The following document gathers the main IPR guidelines stated in the GBIF Memorandum of Understanding, in both Spanish and English:
|
|
|
|
|
|
How to Cite Data from GBIF |
|
As stated in the GBIF Data Use Agreement, data users must acknowledge data providers and collections for their data as appropriate. The following documents discuss different ways of meeting this requirement.
Guidelines for citing specimen and observation data obtained via the GBIF Data Portal - Spanish & English
|
|
Guidelines for citing names data obtained via the GBIF Portal - Spanish & English
|
|
| How to cite GBIF data - White paper - Spanish & English |
|
These documents are drafts under development, available on the GBIF Secretariat web portal (www.gbif.org) since May 6th, 2005. Comments on them are welcome until June 30th, 2005, at
|
|
|
|
Open Access to Biodiversity Data |
|
The GBIF Governing Board has issued a document with recommendations on open access to biodiversity data, addressed to research councils, other funding agencies and private foundations. The original paper can be consulted here. The Coordination Unit of the Spanish GBIF node provides a Spanish translation of the document:
Recommendation on Open Access to Biodiversity Data - Spanish (.PDF, 127 Kb) |
|
|
|
|
| |
|
Data interchange structure and contents |
To ensure interoperability when querying biodiversity databases spread around the world, it is essential to have standard datasets for natural history collections and observational databases. At the moment there are two main alternatives in this field: |
| |
| Darwin Core |
The Darwin Core profile (GBIF, MaNIS, The Species Analyst, etc.) provides a list of suggested access points and recommendations for their use in searches within natural history specimen and observation databases. It provides suggestions for stringifying queries so that they become protocol independent. It also provides guidance on the content, structure and format of records retrieved from an information server supporting the Darwin Core. For up-to-date information about Darwin Core, please visit http://digir.net/.
Schema definition
The original schema of Darwin Core 2 in XML format (ver. 1.0) can be consulted on the digir.net web pages. An easy-to-read version is also available.
Data and structures for Darwin Core migrations
You can download an example of a Darwin Core table in MS Access®, tested under MS Windows® with ODBC and with ADO: DarwinCore.zip (.ZIP file, 158 Kb).
GBIF implements two versions of the Darwin Core standard: Darwin Core 1.2 and Darwin Core 1.4. Detailed information about them is available at:
- Darwin Core Versions
- Darwin Core Concepts
A new version of Darwin Core is under development and is not yet implemented by GBIF. More information about this new version is available at: http://www.tdwg.org/standards/450/
The new version of Darwin Core in MS Access format can be found at: http://code.google.com/p/darwincore/downloads/detail?name=SimpleDwCMSAccess.mdb |
|
| ABCD |
This schema was developed within the European BioCASe project. It is a complex schema with nearly 1500 concepts/fields. Data providers that use the ABCD format can also be accessed from the GBIF international data portal. For further information about this standard, please visit:
|
| |
|
Communication protocols |
|
DiGIR |
The DiGIR communication protocol (Distributed Generic Information Retrieval) is currently used to exchange information in several distributed networks such as GBIF. The protocol is still being improved and developed, but it is already being used to connect hundreds of data providers. For the most recent information about the DiGIR project, please visit:
|
| |
| TAPIR |
The TAPIR protocol (TDWG Access Protocol for Information Retrieval) has emerged as an alternative for exchanging information between data providers and servers, compatible with both Darwin Core 2 and ABCD. The working groups involved in this project are active, and some prototypes are currently being tested. For the latest news about this initiative, please visit:
|
| |
|
Data and structures for data migration |
Scientific Names |
The following links lead to name lists against which you can check the names in your databases.
Specific groups
Global resources
A very interesting global source of taxonomic information about animals, plants and microorganisms is the Catalogue of Life.
|
| |
|
|
|
Software supported by the Coordination Unit of GBIF.ES |
|
The Coordination Unit of GBIF Spain develops software applications that are offered freely and openly to the Spanish scientific community to facilitate data sharing through the GBIF initiative. Software support and training are also offered free of charge to researchers actively participating in GBIF. |
|
|
|
Collection and Project Management Software |
|
|
|
When starting to digitize a natural history collection, observational data obtained in the field, literature data or any other source of biodiversity data, it is essential to know the tools available to carry out the task effectively.
Joining an existing, well-known initiative lets a project benefit from the expertise acquired by other research groups and makes it easier to avoid known problems, so that digitization can run smoothly and effectively from the very beginning. Such initiatives generally cover the most recent international standards for biological databases and support information sharing between databases, which facilitates joining global initiatives like GBIF. |
|
|
|
HERBAR: Botanical collections management program |
|
 |
|
HERBAR is a computer application designed to manage and digitize botanical collections. It is a very complete program, supported and recommended by the Spanish GBIF Node.
It is the standard application of the Iberian and Macaronesian Herbarium Association - AHIM (Asociación de Herbarios Ibero-macaronésicos) and is regularly used in dozens of Spanish institutions to manage herbaria and seed banks.
Please find further information about HERBAR on its website:
http://www.gbif.es/herbar/herbar_in.php |
|
|
|
|
ZOORBAR: Natural history collections management program |
|
 |
|
ZOORBAR is a software application to digitize and manage natural history collections, developed and recommended by the Spanish GBIF Node. Its flexible attribute system makes it possible to adapt it to any biological collection.
The program is being taken up by the zoological community in Spain, and several significant institutions already manage their collections with ZOORBAR.
Please find further information about ZOORBAR on its website:
http://www.gbif.es/zoorbar/zoorbar_in.php |
|
|
|
|
HERBAR-ZOORBAR LIGERO: Biological collections management program |
|
 |
|
Herbar-Zoorbar Ligero (HZL) is a computer application designed to manage and digitize either botanical or zoological collections. Its main goal is to make data exchange easier by using fast recording tables, so that it is not necessary to work with the complete versions of Zoorbar or Herbar.
Please find further information (in Spanish) about HZL at the following link:
http://www.gbif.es/hzl/hzl.php |
|
|
|
|
BIBMASTER: Biodiversity Information Manager |
|
 |
|
BIBMASTER is a database application to manage biodiversity information, especially focused on bibliographic and nomenclatural information. It is developed and recommended by the Spanish GBIF Node.
The program can manage nomenclature, literature, specimen and taxon level information: reference lists, keywords, nomenclature, iconography, checklists, specimen lists, herbarium labels and much more.
Several relevant national initiatives, such as the Flora iberica and Flora Micológica Ibérica projects, use BIBMASTER to manage information.
Please find further information about BIBMASTER on its website:
http://www.gbif.es/bibmaster/bibmaster_in.php |
|
|
|
Validation and Transformation Software |
|
|
|
|
|
FindIt2DarwinCore |
|
|
|
 |
|
FindIt2DarwinCore is a software application designed to extract scientific names from PDF documents located at a web address and to enter them into a table in Darwin Core format. The application uses the "FindIT" web service located at www.ubio.org and can be downloaded at:
http://sourceforge.net/projects/findit2darwinco/
This software has been developed in collaboration with the UBIO staff. |
| |
|
|
|
|
|
|
Darwin Test |
|
|
|
 |
|
DARWIN TEST is a software application to validate and check Darwin Core v2 or Darwin Core 1.4 records (Darwin Core version 1.2 or Darwin Core version 1.4 standard for specimen and observation data exchange).
Before publishing your biodiversity data in a public network such as GBIF, it is highly recommended to test your Darwin Core v2 or Darwin Core 1.4 data with DARWIN TEST in order to detect possible problems. The issues analyzed include omission, typographic, convention and coherence errors. DARWIN TEST is a Microsoft Access® based program. At present, the software is available only in Spanish.
Please find further information about DARWIN TEST on its website:
http://www.gbif.es/darwin_test/Darwin_Test_in.php |
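The four kinds of error mentioned above (omission, typographic, convention and coherence) can be illustrated with a minimal Python sketch. It is not DARWIN TEST's actual rule set: the field names (`ScientificName`, `Country`, `Latitude`, `Longitude`), the choice of required fields and the individual checks are assumptions made only for illustration.

```python
# Hypothetical record checks; field names and rules are illustrative assumptions.
REQUIRED_FIELDS = ("ScientificName", "Country")

def check_record(record):
    """Return a list of (error_type, field) pairs found in one record."""
    problems = []

    # Omission: a required field is missing or empty.
    for field in REQUIRED_FIELDS:
        if not str(record.get(field, "")).strip():
            problems.append(("omission", field))

    # Typographic: stray leading/trailing blanks or doubled spaces.
    for field, value in record.items():
        if isinstance(value, str) and (value != value.strip() or "  " in value):
            problems.append(("typographic", field))

    # Convention: decimal coordinates must lie within their valid ranges.
    lat, lon = record.get("Latitude"), record.get("Longitude")
    if lat is not None and not -90 <= lat <= 90:
        problems.append(("convention", "Latitude"))
    if lon is not None and not -180 <= lon <= 180:
        problems.append(("convention", "Longitude"))

    # Coherence: a latitude without a longitude (or the reverse) is inconsistent.
    if (lat is None) != (lon is None):
        problems.append(("coherence", "Latitude/Longitude"))

    return problems


if __name__ == "__main__":
    sample = {"ScientificName": "Quercus  ilex", "Country": "", "Latitude": 95.0}
    for error_type, field in check_record(sample):
        print(error_type, field)
```

Returning a list of findings rather than stopping at the first error lets a whole table be reviewed in one pass before publication.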
|
|
|
|
Name Parser |
|
| |
 |
|
NAME PARSER is an MS Access 2000® application that parses scientific names into their components: genus, species, species author, infraspecific rank, infraspecific epithet, infraspecific author and year.
Please find further information about NAME PARSER on its website (in Spanish):
http://www.gbif.es/name_parser/Name_Parser.php |
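To give an idea of what splitting a name into these components involves, here is a minimal Python sketch. It is not the NAME PARSER algorithm: the list of rank markers, the field names and the rules are simplifying assumptions, and cases such as the author abbreviation "f." or a year inside parentheses are deliberately not handled.

```python
import re

# Rank markers recognised by this sketch (an assumed, incomplete list).
RANKS = {"subsp.", "ssp.", "var.", "subvar.", "forma"}

FIELDS = ("genus", "species", "species_author", "infraspecific_rank",
          "infraspecific_epithet", "infraspecific_author", "year")

def parse_name(name):
    """Split a scientific name string into a dict of its components."""
    parts = dict.fromkeys(FIELDS, "")
    tokens = name.split()

    # A trailing four-digit token is treated as the year.
    if tokens and re.fullmatch(r"\d{4}", tokens[-1]):
        parts["year"] = tokens.pop()
        if tokens:
            tokens[-1] = tokens[-1].rstrip(",")
    if not tokens:
        return parts

    parts["genus"] = tokens[0]
    rest = tokens[1:]

    # A lower-case token right after the genus is taken as the species epithet.
    if rest and rest[0] not in RANKS and rest[0][0].islower():
        parts["species"] = rest[0]
        rest = rest[1:]

    # Everything before the first rank marker is the species author.
    rank_at = next((i for i, tok in enumerate(rest) if tok in RANKS), None)
    if rank_at is None:
        parts["species_author"] = " ".join(rest)
    else:
        parts["species_author"] = " ".join(rest[:rank_at])
        parts["infraspecific_rank"] = rest[rank_at]
        parts["infraspecific_epithet"] = " ".join(rest[rank_at + 1:rank_at + 2])
        parts["infraspecific_author"] = " ".join(rest[rank_at + 2:])
    return parts


if __name__ == "__main__":
    print(parse_name("Quercus ilex L. subsp. ballota (Desf.) Samp."))
    print(parse_name("Apis mellifera Linnaeus, 1758"))
```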
|
|
|
|
More digitalizing software |
|
There are plenty of initiatives in the software development field to manage and use biodiversity information. Some examples are shown in the following sections: |
|
|
|
VegAna from the University of Barcelona |
|
The Botany Unit of the University of Barcelona has developed an integrated software package called VegAna (Vegetation Edition and Analysis) to manage and analyse biological data. You can download the following applications from the project's main web page:
- Ginkgo: Representation, classification and multivariate analysis of biodiversity data.
- Quercus: relevé data tables editor. Handles relevé data to perform phytosociological works.
- Fagus: floristic citation editor.
- Yucca: a cartographic plotting tool.
A recent addition to this software is the ability to import and use GBIF data.
For further information, please visit the VegAna web page:
http://biodiver.bio.ub.es/vegana/index.html |
|
|
|
GBIF Integrated Publishing ToolKit (IPT) |
|
|
|
The GBIF Integrated Publishing Toolkit (IPT) is an open source software platform developed by the GBIF Secretariat. One of the main technical challenges for GBIF's distributed IT architecture is to remove constraints on data publishing and flow. The new IPT offers simple interfaces to transfer complete data stores in order to publish biodiversity data.
http://ipt.gbif.org |
|
|
|
Other software |
|
A complete list of software for digitizing natural history collections is available on the web pages of the TDWG Subgroup on Biological Collection Data. There you will find links to programs such as HERBAR, SPECIFY, BIOTICA, RECORDER2000 and so on. |
|
|
|
|
|
|
|
The Use of Digital Images in Natural History Collections |
|
|
|
Modern methods for digitizing natural history collections, and for increasing the value of the collections and the databases behind them, include capturing digital images of specimens.
The use of digital images offers many ADVANTAGES:
- They are easy to manipulate and distribute: computer files are readily stored and transported on CDs or DVDs, or sent over the Internet.
- The digitization process can be split up. It is no longer necessary to complete all the digitization work at the collection site or to move the specimens: images can be captured within the collection and processed elsewhere.
- Large amounts of information can be accessed with no need to travel to the collection site or to handle the specimens. A lot of research can be done using digital images, avoiding the cost of loans and transport and the associated risk of damaging delicate specimens. This is especially important for delicate or unique type material.
- The collection's curator and staff are released from most of the tasks related to loans and queries. This time can be invested in better curation or research.
- Researchers save the time needed to travel to the collection site to check those specimen features that are visible in the digital image.
- The data (labels, identifications) and databases associated with the collections can be corrected and updated based on the information provided by the digital images of the specimens and their labels.
- High-quality data sharing with countries of origin (data repatriation) is facilitated, providing them with 'virtual collections' built from the specimens collected in their territories.
- And so on.
|
|
|
|
Initiatives running |
|
Pioneer Projects |
|
Some of the most interesting international initiatives are listed below. They are notable for their significance or their special characteristics:
This ambitious initiative from Harvard College is attempting to create a web-accessible electronic catalogue of all type specimens to be used by all taxonomists and specialists around the world. They maintain a complete online guide about digitalization and image capture of diagnostic characteristics.
The first results of this effort can be consulted in the online catalogues of their Caribbean insects collection and the type specimens of the MZD collection.
The New York Botanical Garden is one of the pioneer institutions in everything related to digital herbaria creation and access. About 800,000 digitalized specimens are available at the moment, with more than 120,000 high resolution images and its own search engine.

The Zoological Museum of Amsterdam, in collaboration with the Dutch national GBIF node and with the technical support of ETI Bioinformatics, has developed a project to digitize type specimens in three dimensions (3D), whose first results, a bird collection, can be queried online. This innovative system allows the user to rotate the specimen and view it from any desired angle.
The Center for Biodiversity and Conservation belongs to the American Museum of Natural History. Its mission is to mitigate critical threats to global biological and cultural diversity by advancing scientific research in diverse ecosystems; strengthening the application of science to conservation practice and public policy; developing professional, institutional, and community capacity; and furthering the Museum's efforts to heighten public understanding and stewardship of biodiversity.
Aluka is an international, collaborative initiative building an online digital library of scholarly resources from and about Africa. Start-up funding for Aluka has been provided by The Andrew W. Mellon Foundation. Aluka seeks to attract high-quality scholarly content about Africa from institutions and individuals across the globe. By contributing their collections to the Aluka platform, content owners will have a means of offering access to their collections to an international audience—without having to develop and support their own technology platforms.
|
|
|
|
Other Initiatives
|
|
 |
|
There are also other global initiatives to capture digital images in natural history collections. Some of them are highly significant, such as the project to digitize the Linnean types herbarium.
|
There are also several Spanish initiatives working in the same direction:
- Digital herbaria:
- Virtual zoological collections:
- Collections of type specimens:
|
|
 |
|
|
|
|
GBIF and Digital Image Use |
|
|
|
|
|
|
|
Methodology |
|
Here are some resources to consult when proposing or carrying out projects with a specimen digital imaging component:
- The EUROPEAN NETWORK FOR BIODIVERSITY INFORMATION (ENBI) included different initiatives related to the digital imaging of natural history collection specimens in its 6th work package (Co-operation of pan-European databases on biological collections and specimens). Some of the most interesting related events are:
- A workshop on 'techniques and challenges for digital imaging of biological types' was held in Stuttgart (Germany) in March 2004. The results of this workshop, together with a detailed report, are available in its corresponding CIRCA section.
- A 'Manual of Best Practice' on digital imaging of biological type specimens was published at the end of 2005. Among the contributions included in this manual, the Spanish study by Arturo H. Ariño and David Galicia on the use of digital images in taxonomic studies stands out (Chapter 11). A full electronic version of the manual is available to download here.
- A working group focused on the management of biological images has been established in the TAXONOMIC DATABASE WORKING GROUP (TDWG). They have launched a wiki as a discussion and focal point, where new ideas are also published.
- The NEW YORK BOTANICAL GARDEN has made a manual on 'Procedures and recommendations for photographing and archiving type specimens of the New York Botanical Garden Herbarium' available. It includes different recommendations about the equipment needed, its configuration, lighting control, etc.
- The E-TYPE INITIATIVE web pages also include information about protocols, considerations and methodology for capturing detailed images.
- The COORDINATION UNIT OF THE SPANISH GBIF NODE organizes periodic courses and workshops on the digitization of natural history collections, whose contents include the use of digital images. Please visit the events section of the GBIF.ES web portal for information about future calls.
|
|
|
|
|
|
|
|
Sensitive Data in the GBIF Network |
|
Definition |
|
The concept of SENSITIVE DATA in the context of GBIF refers to those data about taxa whose geographical information publication could be problematic if shared unprotected. This situation can occur when publishing information about:
- Rare, endangered or legally protected taxa.
- Commercially valuable taxa.
- Showy or fragile taxa (e.g. orchids, nesting/roosting sites, etc.).
- Data subject to withholding request from landowner.
- Data used to derive income.
In other contexts, data can be considered "temporarily" sensitive:
- Data awaiting publication.
- Data subject to ongoing research.
The concept of sensitive data, like those of endemicity or rarity, is closely tied to the geographic area under consideration: data considered sensitive in one area can be considered freely accessible in another. This is particularly relevant for duplicates of vouchered specimens stored in different natural history collections, or for data served from outside the area covered by the restriction, legislation, etc.
|
|
|
|
Sensitive data sharing in GBIF |
|
When sensitive data are to be shared through the GBIF network, a process to generalize or delete the geographical data must be designed and run before publication.
One of the most important requirements when sharing sensitive data is to document the process carried out to generalize or remove the geographical information. Users must be aware that the data made available to them are not the original data. Nevertheless, the generalized data available on the Internet inform users that the records exist, and data providers keep the ability to evaluate, case by case, whether to share more detailed information.
|
|
|
|
How to manage these data? |
|
The FIRST STEP is to find out which records in our data set are considered sensitive within our scope. In the case of Spain, it is essential to know the applicable international, national and regional legislation:
International Legislation
National Legislation
Regional Legislation
The SECOND STEP is to decide which method of protection to apply to the records:
- Totally remove the geographical information: this affects both the textual description of the locality and the geographical coordinates (if available) of the collecting site.
- Partially remove the geographical information. Several methods are available:
- Round down the coordinates to a given number of decimal places (e.g. use squares that define an area of 10x10 km).
- Remove the contents of the most detailed locality fields (e.g. keep only the country, state/province or municipality).
- Generalize the geographical data: the original accuracy is maintained but the point does not match the original collecting/sighting site. Tools to carry out these tasks automatically are being developed: once they are available, they will be published here.
FINALLY, as stated at the beginning of this document, it is essential to indicate clearly in a field ('notes' or similar) the process the records have undergone. Typical statements might read:
- This specimen represents an endangered or threatened species. The specific locality has been removed from the on-line record to protect this species from over-collection. These data may be supplied to researchers on request.
- This specimen represents an endangered or threatened species. The specific locality has been generalized to presence within a grid 1 minute resolution. Detailed data may be supplied to researchers on request.
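The partial-removal approach and the accompanying note can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record is a plain dictionary, the field names (`latitude`, `longitude`, `locality`, `notes`) are hypothetical, and rounding down to one decimal place is used as the example (one tenth of a degree of latitude is roughly 11 km).

```python
import math

def generalize_coordinates(lat, lon, decimals=1):
    """Round both coordinates down to the given number of decimal places."""
    factor = 10 ** decimals
    return (math.floor(lat * factor) / factor,
            math.floor(lon * factor) / factor)

def protect_record(record, decimals=1):
    """Return a copy of the record with generalized coordinates and a note.

    The field names used here are hypothetical; adapt them to your own table.
    """
    protected = dict(record)
    lat, lon = generalize_coordinates(record["latitude"], record["longitude"], decimals)
    protected["latitude"], protected["longitude"] = lat, lon
    protected["locality"] = ""  # drop the detailed locality description
    note = (f"Coordinates rounded down to {decimals} decimal place(s) and "
            "detailed locality removed to protect a sensitive taxon. "
            "Detailed data may be supplied to researchers on request.")
    protected["notes"] = (record.get("notes", "") + " " + note).strip()
    return protected


if __name__ == "__main__":
    original = {"latitude": 40.4168, "longitude": -3.7038,
                "locality": "Hypothetical site description", "notes": ""}
    print(protect_record(original))
```

The original record is left untouched, so the provider keeps the detailed data and can still evaluate individual requests for it.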
|
|
|
|
1st Call of the Sensitive Data Management Workshop |
|
In October 2008 the Coordination Unit of the Spanish GBIF Node organised the first call of the Sensitive Data Management Workshop. When sensitive data are published online, restrictions arise mainly from the protection of endangered or legally protected species and from safeguarding due attribution to the data owner. Without minimizing the importance of the second issue, the workshop focused on the first: protecting threatened species by restricting access to sensitive data.
Find further information (in Spanish) about the workshop at: http://www.gbif.es/formaciondetalles.php?IDForm=44
A summary of the discussion held is available in English at: |
|
|
|
Available Documents |
|
If you are interested in obtaining more information, GBIF Secretariat in Copenhagen has issued several interesting documents related to sensitive data:
- In March 2006, a survey was carried out to evaluate what was considered sensitive data by biodiversity data providers, and the methods used to tackle the management of this kind of information. There is a report on the results of this survey in:
Questionnaire on Dealing with Sensitive Primary Species Occurrence Data (.PDF file, 567 Kb).
- When the analysis of the survey was finished and after having studied other similar initiatives, a complete report about dealing with Sensitive Primary Species Occurrence Data was made available.
Report - Dealing with Sensitive Data (.PDF file, 530 Kb).
- The final step in this process was to develop a Guide to Best Practices. This document should be seen as an overriding guideline for institutions, data providers and GBIF Nodes to use to develop their own in-house guidelines.
Guide to best practices for generalising primary species occurrence data (.PDF file, 558 Kb).
|
|
|
|
Georeferencing tools |
GEOLocate |
|
Traditional methods for georeferencing collection data (capturing map coordinates) from text descriptions are tedious and time consuming, typically involving finding the locality on either a hard-copy or a digital map, plotting the locality, and determining the coordinates.
|
GEOLocate is a comprehensive electronic georeferencing solution, funded by the National Science Foundation and developed by Tulane University's Museum of Natural History, designed to facilitate the task of assigning geographic coordinates to the locality data associated with natural history collections.
More information:
We provide two support contacts for GEOLocate users: you can send your questions, comments, queries, etc. either to the distribution list (GEOLocate-L) or to the following email address: ayuda_geolocate@gbif.es. Both will be available soon.
|
| |
|
Mapping Tools |
Tools to display geocoded records derived from biodiversity data sources |
Downloadable software:
|
- C-squares mapper
The CSIRO Marine Research (CMR) c-squares mapper is a Perl utility. Latitudes and longitudes are converted to c-squares, which are georeferenced tiles (squares) representing identifiable portions of the earth's surface.
Commonwealth Scientific and Industrial Research Organisation (CSIRO), Australia.
http://www.marine.csiro.au/csquares/
|
|
- DIVA-GIS
DIVA-GIS is free geographic information system (GIS) software. It can be used to create species distribution maps and grid maps of the distribution of biological diversity as well as predictions of species presence based on climate using the BIOCLIM or DOMAIN models.
Download: http://www.diva-gis.org/
|
 |
- gvSIG
gvSIG is a tool oriented to manage geographic information. It is characterized by a user-friendly interface, with a quick access to the most usual raster and vector formats. In the same view it includes local as well as remote data through a WMS, WCS or WFS source.
Download: http://www.gvsig.gva.es/index.php?id=gvsig&L=0&L=2
|
|
- Google Earth
Google Earth lets you fly anywhere on Earth to view satellite imagery, maps, terrain, 3D buildings and even explore galaxies in the Sky. You can explore rich geographical content, save your toured places and share with others.
Download: http://earth.google.com/intl/en_uk/
|
|
- Google Maps
Google Maps is a free web mapping service application and technology provided by Google that powers many map-based services including the Google Maps website, Google Ride Finder and embedded maps on third-party websites via the Google Maps API. It offers street maps, a route planner, and an urban business locator for numerous countries around the world.
Download: http://maps.google.com/
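Returning to the C-squares mapper listed above, the conversion of a latitude/longitude pair into a c-square code can be sketched as follows. This is a Python illustration, not the CSIRO Perl utility, and it assumes the commonly described c-squares notation: a global quadrant digit (1 = NE, 3 = SE, 5 = SW, 7 = NW), one digit for the tens of degrees of latitude and two for the tens of degrees of longitude, optionally followed by a colon and a three-digit group (intermediate quadrant, units of latitude, units of longitude) for 1-degree squares. The function name and the handling of the poles and the 180° meridian are assumptions.

```python
def csquare(lat, lon, resolution=10):
    """Return the c-square code of a point at 10-degree or 1-degree resolution."""
    if not (-90 <= lat <= 90 and -180 <= lon <= 180):
        raise ValueError("coordinates out of range")
    if resolution not in (10, 1):
        raise ValueError("this sketch only supports resolutions of 10 and 1 degrees")

    # Global quadrant: 1 = NE, 3 = SE, 5 = SW, 7 = NW.
    if lat >= 0:
        quadrant = 1 if lon >= 0 else 7
    else:
        quadrant = 3 if lon >= 0 else 5

    # Keep points on the outer limits inside the last square (assumed convention).
    abs_lat = min(abs(lat), 89.999999)
    abs_lon = min(abs(lon), 179.999999)

    code = f"{quadrant}{int(abs_lat // 10)}{int(abs_lon // 10):02d}"
    if resolution == 10:
        return code

    lat_unit = int(abs_lat) % 10
    lon_unit = int(abs_lon) % 10
    # Intermediate quadrant: 1-4 depending on which half (0-4 or 5-9) each unit falls in.
    intermediate = 2 * (lat_unit >= 5) + (lon_unit >= 5) + 1
    return f"{code}:{intermediate}{lat_unit}{lon_unit}"


if __name__ == "__main__":
    print(csquare(-42.9, 147.3))                # 10-degree square
    print(csquare(-42.9, 147.3, resolution=1))  # 1-degree square
```

Because every record falling in the same tile gets the same code, the codes can be used to group or plot occurrences at a chosen resolution.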
|
|
Online mapping applications:
|
|
|
|
| |
|
| |
| Geographic and species distribution tools |
| |
Displaying the position of a site geographically can help to detect errors and therefore to increase data quality. Some of these tools, known as geographic information systems (GIS), also contain analytical functions that make it possible to predict species' distributions.
One of the most relevant international initiatives currently in development is GEOSS, the Global Earth Observation System of Systems. It is envisioned as a large national and international cooperative effort to bring together existing and new hardware and software, making it all compatible in order to supply data and information at no cost. It is coordinated by the Environmental Protection Agency of the United States.
GEOSS can be considered a counterpart of GBIF in the geographic information field. It is expected to become a key element enabling major advances in the analysis and use of biodiversity information, as it will make tools and large datasets available for large-scale studies.
See also:
- An introduction to the basic concepts (coordinates, datums, projections…) to take into account when managing geographical information or georeferencing specimens (.PPT file - 3.24 Mb).
|
Geographic resources manual |
| |
The GEOteca web site hosts a compendium of geographic resources compiled by students and lecturers of the Universidad Autónoma de Madrid within the framework of the Programas de Innovación Docente.
It gathers geographical information that can be found on the Internet in its widest sense:
- Digital publications on geography and environment, degrees, departments, societies and literature.
- Spanish GIS services available via Internet and international resources.
- Satellite imagery relevant to Spain, image processing software etc.
- Server on statistical population data, local, regional, national and international data, products and natural resources, documentation, literature, etc.
Their webpage: http://www.uam.es/docencia/geoteca/geoteca.html |
| |
Data and structures for data migration |
When performing data migrations, it is essential to map our data to an internationally accepted coding system to ensure compatibility with global initiatives and subsequent data exchanges.
- ISO 3166 code lists: codes and names from ISO 3166-1 (the country codes) and ISO 3166-2 (the country subdivision codes) in English and French.
- MS Access® countries table with ISO 3166-1 codes of two and three characters length and its names in Spanish: Paises.zip (.ZIP file, 21 Kb - Last update 12-02-2007).
- Table with the codes of country subdivisions (states, departments, provinces,...) as they are defined in the ISO 3166-2 standard (60 countries): Provincias.zip (.ZIP file, 45 Kb).
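As an illustration of attaching such codes during a migration, the following Python sketch maps country names written in Spanish to their ISO 3166-1 two- and three-letter codes. The lookup table is a tiny hand-made excerpt, not the contents of Paises.zip, and the function name is hypothetical; in practice the full table would be loaded from the downloaded file.

```python
# Small excerpt of ISO 3166-1 codes keyed by Spanish country name (lower case).
ISO_3166_1 = {
    "españa": ("ES", "ESP"),
    "portugal": ("PT", "PRT"),
    "francia": ("FR", "FRA"),
    "marruecos": ("MA", "MAR"),
    "méxico": ("MX", "MEX"),
}

def country_codes(name, table=ISO_3166_1):
    """Return the (alpha-2, alpha-3) codes for a country name, or None if unknown."""
    return table.get(name.strip().lower())


if __name__ == "__main__":
    for raw in (" España ", "MÉXICO", "Atlantis"):
        print(repr(raw), "->", country_codes(raw))
```

Returning `None` for unknown names, instead of guessing, makes it easy to list the values that still need manual review before the data are exchanged.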
|
| |
|
|
|
GBIF guides and other recommended resources |
|
|
|
BioGeomancer Guide to Georeferencing |
|
 |
|
This publication is one of the outputs from the BioGeomancer project and discusses in depth best practices for georeferencing biological species (specimen and observational) data. The publication presents examples of how to georeference a wide range of different location types, and provides information and examples on how to determine the extent and maximum uncertainty distance for locations based on the information provided.
Released on: 22 August 2006
Written by: Arthur D. Chapman and John Wieczorek
URL: http://www2.gbif.org/BioGeomancerGuide.pdf
|
|
|
|
|
Principles of Data Quality |
|
|
|
Data quality and errors in data are often neglected issues with environmental databases, modeling systems, GIS, decision support systems, etc. Too often, data are used uncritically without consideration of the errors contained within, and this can lead to erroneous results, misleading information, unwise environmental decisions and increased costs. This paper expands on these issues and discusses a number of principles of data quality that should become core to the business of the natural history collections and observational communities as they release their data to the broader community.
Released on: 17 August 2005
Written by: Arthur D. Chapman
URL: http://www2.gbif.org/DataQuality.pdf
|
|
|
|
|
Principles and Methods of Data Cleaning |
|
|
|
This document examines methods for preventing, detecting and cleaning errors in primary biological collections databases. It first sets out a set of simple principles that should be followed in any data cleaning exercise. It then discusses guidelines, methodologies and tools that can assist the natural history collections community and the observational communities to follow best practice in digitizing, documenting and validating information.
Released on: 17 August 2005
Written by: Arthur D. Chapman
URL: http://www2.gbif.org/DataCleaning.pdf |
|
|
|
|
Uses of Primary Species-Occurrence Data |
|
| |
|
This paper examines uses for primary species occurrence data in research, education and in other areas of human endeavor, and provides examples from the literature of many of these uses. The paper examines not only data from labels, or from observational notes, but the data inherent in museum and herbarium collections themselves, which are long-term storage receptacles of information and data that are still largely untouched.
Released on: 17 August 2005
Written by: Arthur D. Chapman
URL: http://www2.gbif.org/UsesPrimaryData.pdf
|
|
|
|
|
Guide to Best Practices for Generalising Sensitive Species Occurrence Data |
|
| |
|
This Guide to Best Practices for Generalising Sensitive Species Occurrence Data provides a key for data providers to use in determining whether a species or attribute should be regarded as sensitive and its level of sensitivity, and provides guidelines on consistent wording for use in documentation. It also provides guidelines on methods for generalizing data, both spatial and non-spatial.
Released on: 31 March 2008
Written by: Arthur Chapman and Oliver Grafton
URL: http://www2.gbif.org/BPsensitivedata.pdf |
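As a rough illustration of spatial generalisation, one simple approach is to reduce the precision of the coordinates so that a sensitive locality can no longer be pinpointed. The sketch below is only an example under stated assumptions and is not presented as the method prescribed by the guide: the function name `generalise_coordinates` and the default of one decimal place are hypothetical choices.

```python
from typing import Tuple


def generalise_coordinates(lat: float, lon: float, decimals: int = 1) -> Tuple[float, float]:
    """Round decimal-degree coordinates to a coarser precision.

    One decimal place corresponds to roughly 11 km in latitude.
    """
    if not -90.0 <= lat <= 90.0:
        raise ValueError("latitude must be between -90 and 90 degrees")
    if not -180.0 <= lon <= 180.0:
        raise ValueError("longitude must be between -180 and 180 degrees")
    return (round(lat, decimals), round(lon, decimals))


# Hypothetical occurrence record with precise coordinates.
precise = (40.4168, -3.7038)
generalised = generalise_coordinates(*precise)
```

Whatever precision is chosen, it is good practice to record it alongside the data so that users know the coordinates have been generalised.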
|
|
Significance of Organism Observations |
|
| |
|
Observations of nature are the foundation of ecological studies, which use them to search for patterns in nature, and of biodiversity conservation. Organism observation (observational) data are a major constituent of “primary biodiversity data”.
Released on: 09 October 2008
Written by: Steve Kelling.
URL: http://www2.gbif.org/Observational_Data.pdf |
| |
| Observations on observational biodiversity data |
| |
|
|
This publication describes all the work done for ENBI at the University of Turku.
Released on: 21 March 2006
Written by: J. Salo, M. Vieno, T. Toivonen, I. Saaksjarvi, R. Kumpulainen, S. Juvonen.
URL: http://circa.gbif.net/Public/irc/enbi/comm/library?l=/enbi_reports |
| |
| A view on collection databasing |
| |
|
Written by: Francisco Pando
Collection databasing is expensive and laborious, and its results become visible only in the long term. For that reason, such a task should be tackled from the perspective of utility. When planning the computerization of a collection, it is necessary to consider carefully the objectives and how the database is going to fulfill them. Doing otherwise wastes time and money.
From the experience acquired in collection databasing at our institution, we learnt that the most costly part of the computerization process is moving the material between the herbarium and the computer, both for data entry and for data checking and proofreading. Therefore, it is more efficient to enter all the data for each specimen, for a limited number of specimens (a family or a group of families, for instance), than to undertake a partial computerization of the whole collection. The computerization of a herbarium must cover three objectives:
|
- To help exploit more fully the information that the collection contains. This is especially desirable in the fields of ecology, phenology, history, nomenclature and environmental impact assessment.
- To protect the specimens by eliminating or reducing the need to handle the material for many studies.
- To contribute to the management of the herbarium in tasks such as labeling, loan processing (in and out) and exchanges.
One danger in building a collection database is that the effort it requires may be out of proportion to the benefits obtained. For this reason, it is especially worthwhile to tie the computerization of the collection to active research projects that in some way need the information it contains. In this way, the utility of the work is clear and its viability is strengthened.
As in any database exercise, perspective on reality should not be lost. It is nonsense to make a database if the specimens are not well preserved. It is equally unsound to database a collection if the specimens cannot be found, or are not accessible. The priorities should be stated clearly: preserve, make accessible, database.
A practical detail, usually overlooked, is the relationship between database and collection. It is of the utmost importance to guarantee that going from the database to the specimen, and the other way around, is not only possible but straightforward as well. This goal implies two requirements. a) Any specimen must be unmistakably identifiable from the data in the database (i.e. with a database record at hand, the specimen it refers to can be pinpointed exactly). This is where accession numbers, barcodes and the like come into play. It is also most convenient to provide a way to know quickly whether a specimen is already databased or not. b) Any specimen must be quickly locatable with the data in the database (e.g. in databases that hold multiple identifications per specimen, it must be clear under which one the specimen is filed).
Finally, I would like to emphasize how vital it is for the success of any collection database project to integrate it into the routine of the collection. An important step in this direction is to persuade researchers and collectors who contribute material to the collection to use a database compatible with the herbarium's for their own record lists and labels. Conceiving collection databasing as an extraordinary action, limited in time, implies that, as the collection grows, the database is no longer a representation of the collection; its utility drops, it loses interest and, eventually, it is abandoned.
URL: http://www.gbif.es/videos/video_detall.php?IDVideo=53 |
| |
|
|
|