This Wiki is the documentation for the BioVeL Biodiversity Virtual Laboratory, http://portal.biovel.eu/. If you discover problems or have suggestions for improving this documentation, please send an email to support@biovel.eu.

Please visit http://www.biovel.eu for background information about the BioVeL project itself.

Description

This workflow takes as input a file containing species occurrence points and uses it to create a model with the openModeller Web Service. The algorithm, environmental layers and mask are selected during the workflow. The model is tested (an internal test and an optional external test by cross validation) and then projected one or more times. All points from the input file are used to create a single model, so it is important to make sure that the records refer to the same species, unless you are interested in some sort of multi-species model. Cross validation calculates the mean AUC. Model projections can be downloaded from the links in the workflow output. They are GeoTIFF files with suitability values ranging from 0 to 254 (nodata=255).
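
As a minimal sketch of how the projection values might be interpreted, the Python snippet below rescales raw cell values to a suitability between 0 and 1 and treats 255 as nodata. The linear division by 254 and the function name to_suitability are assumptions made for illustration; they are not taken from the openModeller documentation. Reading the GeoTIFF file itself requires a raster library and is not shown.

```python
NODATA = 255            # nodata marker stated in the description above
MAX_SUITABILITY = 254   # highest suitability value stated above


def to_suitability(cell_value):
    """Rescale a raw cell value (0-254) to the range 0.0-1.0.

    Returns None for nodata cells. The linear rescaling is an
    assumption made for illustration.
    """
    if cell_value == NODATA:
        return None
    if not 0 <= cell_value <= MAX_SUITABILITY:
        raise ValueError("unexpected cell value: %r" % (cell_value,))
    return cell_value / MAX_SUITABILITY


# Hypothetical raster row, not taken from a real projection.
row = [0, 127, 254, 255]
print([to_suitability(v) for v in row])
```

Returning None for nodata keeps masked cells from being mistaken for cells with zero suitability.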

 

For more information about the input file format, see the section "Input file format for Ecological Niche Modelling" below. If you use the example occurrence points, note that Gammarus tigrinus is an aquatic species, so you need to choose marine environmental layers during the modelling procedure.

Requirements: When running on the Taverna workbench, this workflow requires an Internet connection and the Interaction plug-in to be installed.

Please note that ecological niche modelling experiments can take a long time to run, sometimes several hours, depending on the parameters. This may happen with high resolution environmental layers, thousands of occurrence points and computationally heavy algorithms such as ANN and GARP BS. Cancelling a workflow run may not cancel the corresponding job on the server side, so repeatedly cancelling and restarting runs may overload the server.

General

Name of the workflow and its myExperiment identifier

Name: Generic ENM workflow with interaction

The workflow pack can be downloaded from myExperiment workflow 3355

Date, version and licensing

Last updated: 04/12/2014

Version: 25

Licensing: Creative Commons Attribution ShareAlike CC-BY-SA

How to cite this workflow

To report work that has made use of this workflow, please add the following credit acknowledgement to your research publication:

The results reported in this publication come from processing data (<personal source or others--cite which, e.g. GBIF>) through BioVeL workflows and services (www.biovel.eu). The ecological niche modelling workflows were run on <date of the workflow run>. BioVeL is funded by the EU’s Seventh Framework Program, grant no. 283359.

Scientific specifications

Keywords

Ecological niche modelling, species distribution modelling.

Scientific workflow description

The ecological niche modelling workflow uses occurrence and environmental data to model ecological niches with the openModeller Web Service (http://openmodeller.sf.net/) (Muñoz et al. 2011). openModeller is an ecological niche modelling library providing a uniform method to model species distribution patterns with a variety of algorithms, including GARP, Climate Space Model, Bioclimatic Envelopes, Support Vector Machines and others. It combines species occurrence data with environmental datasets in the form of georeferenced raster layers (such as temperature, precipitation, salinity) to generate potential distribution models.

Key steps in the workflow:

  1. Create Model: The first step is algorithm selection and parameters specification, in which the user selects the modelling algorithm and parameter values to use. Different algorithms can be used to create niche models, and algorithm suitability depends on many factors including number of occurrence points, availability of absence data, type and number of environmental variables and purpose of the experiment. In the next step, environmental layer selection, the user defines the environmental raster layers with which to build the model. Next, geospatial mask selection allows the user to create or select a specific geospatial mask. The model is created using all input points inside the mask, and if the algorithm requires background or pseudo-absence points, they are also sampled from the masked region.

  2. Test Model: In this step, a statistical evaluation of the model prediction is performed. An internal test is run using 100% of the points used to create the model. The internal test calculates the ROC curve, the AUC and threshold-dependent statistics (accuracy and omission) using a fixed threshold (0.5). An external test (10-fold cross validation) can optionally be run, measuring the average AUC.

  3. Project Model: In this step, the user selects the layers with environmental data for model projection. The user has the option to create or select a geospatial mask for model projection (i.e. to project the model onto a different area from the one where it was created). At this step, it is possible to create additional projections (for example using future environmental layers, such as those for 2050). The projections and associated occurrence points are visualized through the web-based BioSTIF interface.
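
To make the statistics named in the Test Model step more concrete, the Python sketch below computes an AUC and the accuracy and omission rate at a fixed threshold of 0.5. It is illustrative only and does not reproduce the openModeller implementation. It assumes that suitability scores are already scaled to the range 0 to 1, that scores for background (or absence) points are available, and that a score equal to the threshold counts as predicted presence. The function names and the sample scores are hypothetical.

```python
def auc(presence_scores, background_scores):
    """Area under the ROC curve by pairwise comparison.

    Equals the probability that a randomly chosen presence point scores
    higher than a randomly chosen background point (ties count as 0.5).
    """
    if not presence_scores or not background_scores:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for p in presence_scores:
        for b in background_scores:
            if p > b:
                wins += 1.0
            elif p == b:
                wins += 0.5
    return wins / (len(presence_scores) * len(background_scores))


def threshold_stats(presence_scores, background_scores, threshold=0.5):
    """Return (accuracy, omission) at a fixed threshold.

    Assumption: a score >= threshold is treated as predicted presence.
    """
    if not presence_scores or not background_scores:
        raise ValueError("both score lists must be non-empty")
    true_pos = sum(1 for s in presence_scores if s >= threshold)
    true_neg = sum(1 for s in background_scores if s < threshold)
    total = len(presence_scores) + len(background_scores)
    accuracy = (true_pos + true_neg) / total
    # Omission: share of known presence points predicted as absent.
    omission = (len(presence_scores) - true_pos) / len(presence_scores)
    return accuracy, omission


# Hypothetical scores, not produced by a real model.
presence = [0.9, 0.8, 0.4]
background = [0.7, 0.3, 0.2, 0.1]
print(auc(presence, background))
print(threshold_stats(presence, background))
```

In a 10-fold cross validation, the same AUC calculation would be repeated on each held-out fold and the results averaged.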

A diagrammatic representation of the workflow can be found here.

References
Muñoz, M.E.S. et al. (2011). "openModeller: a generic approach to species' potential distribution modelling." Geoinformatica 15(1): 111-135.

Technical specifications

The workflow has been developed to be run in the Taverna automated workflow environment. In its current form, the workflow file (with the .t2flow extension) can be loaded and executed in the workbench variant of Taverna. It has been tested with Taverna Workbench version 2.4. The workflow can also be run in the BioVeL Portal, a lightweight user interface which allows browsing, reviewing and running Taverna workflows without the need to install any software.

Example data file for Ecological Niche Modelling Workflow

This example data file contains 112 occurrence records of the amphipod crustacean Gammarus tigrinus.

ENM_example_data.csv

Input file format for Ecological Niche Modelling

The ENM workflow takes an input file that is a text file containing species occurrence points in CSV format.

Each line in the file corresponds to a different record, with values separated by commas. The first line must be a header containing column names, also separated by commas. The following columns are mandatory to run this workflow AND must be spelled EXACTLY as follows: occurrenceID, nameComplete, decimalLongitude and decimalLatitude. Other columns can be present in the file. Columns can be in any order, but they must match the order of the corresponding values.

Note: For the time being, the values for latitude and longitude should be passed in WGS84.

Example including extra columns:

authorship,genusPart,infragenericEpithet,specificEpithet,infraspecificEpithet,nameComplete,uninomial,taxonName,occurrenceID,decimalLatitude,decimalLongitude,earliestDateCollected,latestDateCollected,coordinateUncertaintyInMeters,country,collector,fieldNotes,locality,maximumDepthInMeters,maximumElevationInMeters,minimumDepthInMeters,minimumElevationInMeters,value,dataProviderName,dataResourceName,dataResourceRights,dataResourceCitation
,,,Cordylophora caspia,,Cordylophora caspia,,Cordylophora caspia,388530676,0.056756,2.463239,2011-01-01,2011-01-01,,Sweden,Fredrik Pleijel,,Idefjorden st 5,,,,,,ArtDatabanken,Artdata,,

All records are used to generate a single model regardless of the species name.
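
The format rules above can be checked before a file is submitted. The Python sketch below is a hypothetical pre-flight check and not part of the workflow. It verifies that the four mandatory columns are present, that coordinates are numeric and within the valid latitude and longitude ranges, and it reports when more than one species name occurs, since all records end up in a single model. The function name check_occurrences and the sample records are assumptions made for illustration.

```python
import csv
import io

# Mandatory column names, spelled exactly as required by the workflow.
REQUIRED_COLUMNS = ("occurrenceID", "nameComplete",
                    "decimalLongitude", "decimalLatitude")


def check_occurrences(csv_text):
    """Return (records, problems) for occurrence data given as CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = reader.fieldnames or []
    missing = [c for c in REQUIRED_COLUMNS if c not in header]
    if missing:
        return [], ["missing mandatory column(s): " + ", ".join(missing)]

    records, problems = [], []
    # Records are counted from 1, excluding the header line.
    for number, row in enumerate(reader, start=1):
        try:
            lon = float(row["decimalLongitude"])
            lat = float(row["decimalLatitude"])
        except (TypeError, ValueError):
            problems.append("record %d: coordinates are not numeric" % number)
            continue
        if not (-180.0 <= lon <= 180.0 and -90.0 <= lat <= 90.0):
            problems.append("record %d: coordinates out of range" % number)
            continue
        records.append((row["occurrenceID"], row["nameComplete"], lon, lat))

    # All records feed one model, so flag mixed species names.
    names = sorted({name for _, name, _, _ in records})
    if len(names) > 1:
        problems.append("more than one species name: " + ", ".join(names))
    return records, problems


# Hypothetical sample data, not taken from ENM_example_data.csv.
sample = (
    "occurrenceID,nameComplete,decimalLatitude,decimalLongitude\n"
    "1,Gammarus tigrinus,52.5,4.9\n"
    "2,Gammarus tigrinus,95.0,4.9\n"
)
records, problems = check_occurrences(sample)
print(len(records), problems)
```

Because the columns are looked up by name, the check works regardless of column order or of any extra columns in the file.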

Installing ENM Workflow on Taverna Workbench

When running on the Taverna workbench, this workflow requires the Interaction plug-in to be installed.

Links and references relating to Ecological Niche Modelling

Several useful links and references relating to ecological niche modelling.

 

 
