Title | Body | Technical Expertise Required | Cost | Additional Information

Dropbox is an online file storage and sharing service. 2 GB of storage is available for free, and paid subscriptions offer up to 100 GB. Shared folders allow people to work together on the same projects and documents.

Dropbox files are also available offline, and folders can be synced between multiple computers and mobile devices. Dropbox can therefore serve as a backup mechanism for important files, although it is by no means a complete backup solution.

No programming | Free

Git is a distributed version control system (DVCS). Each developer or user has a local copy of a repository that includes the entire revision history, and changes are copied from one repository to another. Branching and merging are easy to do. Because users do not depend on network access or a central server, Git is very fast and scales well to large projects. It provides cryptographic authentication of history and offers tools for both everyday interactive use and scripting of more complex operations.

No programming | Free
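The idea behind cryptographic authentication of history can be illustrated with a minimal Python sketch. This is a toy hash chain, not Git's actual object format: the functions `commit_id` and `build_history` and the payload layout are assumptions made for illustration. Each identifier is derived from a change's content together with its parent's identifier, so altering an early change alters every later identifier.

```python
import hashlib


def commit_id(parent_id: str, message: str, content: str) -> str:
    """Hash the parent id together with this change's message and content."""
    payload = f"parent {parent_id}\n{message}\n{content}".encode("utf-8")
    return hashlib.sha1(payload).hexdigest()


def build_history(changes):
    """Return the chain of ids for a sequence of (message, content) pairs."""
    ids = []
    parent = ""  # the first change has no parent
    for message, content in changes:
        parent = commit_id(parent, message, content)
        ids.append(parent)
    return ids


original = build_history([("add readme", "hello"), ("fix typo", "hello world")])
# Same second change, but the first change was altered after the fact.
tampered = build_history([("add readme", "HELLO"), ("fix typo", "hello world")])

# The final ids differ even though the last change is identical.
history_was_altered = original[-1] != tampered[-1]
```

Comparing only the most recent identifier is enough to detect that two histories diverge somewhere earlier, which is the property this sketch is meant to show.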

Mathematica is a computational platform used by scientists, engineers, and mathematicians. It supports equation solving, numerical analysis, graphing, and visualization, and has import and export filters for tabular data, images, video, sound, CAD, GIS documents, and biomedical formats. It also supports data mining tools such as cluster analysis, sequence alignment, and pattern matching, as well as text mining. Its programming language supports functional, procedural, and object-oriented styles.

Basic programming skills | Cost-basis

See permissions for logo: http://media.wolfram.com/logos/


MATLAB is an interactive data analysis and visualization environment that can efficiently perform computationally intensive operations on large data sets. MATLAB also provides a high-level programming language that supports rapid development of workflow scripts and graphical user interface (GUI) applications to automate repetitive tasks. A wide variety of discipline-specific software libraries, called toolboxes, are available from the publisher or user communities to extend the capabilities of the base program (e.g., statistics, curve fitting, image analysis, and mapping). MATLAB programs can also leverage existing code written in Fortran, Java, or other languages. Source code is provided for most functions, allowing end users to extend or customize routines for specialized analyses.

Basic programming skills | Cost-basis
  • MATLAB documentation (http://www.mathworks.com/access/helpdesk/help/techdoc/)
  • Mathtools.net, a link exchange for technical computing (http://www.mathtools.net/MATLAB/)
  • Hanselman, D. and Littlefield, B. 2004. Mastering MATLAB 7. Prentice Hall. 864pp. (ISBN: 978-0131430181)
  • Scholarpedia article on MATLAB (http://www.scholarpedia.org/article/MATLAB)
Mercurial (Hg)

Mercurial is a free, distributed source control management tool used for version control of files. Each developer has a local copy of the entire development history.

  • It works independently of network access or a central server.
  • Committing, branching and merging are fast and cheap.
  • You can generate diffs between revisions or jump back in time within seconds, which makes Mercurial suitable for large projects.
  • Mercurial is platform independent. Most of Mercurial is written in Python, with a small part in portable C for performance reasons.
  • The functionality of Mercurial can be expanded with extensions, which can change the workings of the basic commands, add new commands and access all the core functions of Mercurial.
  • The basic interface is easy to use, easy to learn and hard to break.
No programming | Free
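What a diff between two revisions contains can be sketched with Python's standard `difflib` module. This is not how Mercurial computes or stores diffs; the revision contents and the helper name `diff_revisions` are made up for illustration.

```python
import difflib


def diff_revisions(old_lines, new_lines):
    """Return a unified diff between two revisions of a file as a list of lines."""
    return list(
        difflib.unified_diff(
            old_lines,
            new_lines,
            fromfile="rev1",
            tofile="rev2",
            lineterm="",  # the inputs carry no trailing newlines
        )
    )


# Two hypothetical revisions of the same small file.
rev1 = ["alpha", "beta"]
rev2 = ["alpha", "gamma"]

diff = diff_revisions(rev1, rev2)
for line in diff:
    print(line)
```

Lines prefixed with `-` were removed and lines prefixed with `+` were added, while unchanged context lines are prefixed with a space.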

OpenMI provides users with a standard interface that allows the construction of modeling workflows. OpenMI allows models to exchange data with each other and other modeling tools as they run, facilitating the modeling of process interactions. Models may come from many different sources, represent processes from different scientific domains, have different spatial and temporal resolutions, and have different spatial domains/representations. The OpenMI standard is defined by a set of software interfaces that a compliant model must implement. These interfaces enable models to communicate with each other, with the possibility of two-way links between models where the involved models mutually depend on calculation results from each other. The OpenMI interfaces are available in both C# and Java. Models may run asynchronously with respect to timesteps.

The OpenMI standard is a software component interface definition for the computational core (the engine) of a model. Model components that comply with the standard can therefore be configured to exchange data during computation without any further programming. Once developed, OpenMI models can be reused in many different applications and configurations. Most existing applications of OpenMI, and consequently most of the available OpenMI-compliant models, have been developed within the water resources domain.
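The following Python sketch illustrates the general idea of models that implement a shared interface and exchange data as they run, including a two-way link and different timestep lengths. It is not the OpenMI API (whose interfaces are defined in C# and Java): the names `LinkableModel`, `RelaxationModel`, and `get_values` are hypothetical, and answering a re-entrant request with the latest known state is a simplification chosen here to keep the mutual dependency from recursing forever.

```python
from abc import ABC, abstractmethod


class LinkableModel(ABC):
    """Hypothetical interface that every coupled model must implement."""

    @abstractmethod
    def get_values(self, time: float) -> float:
        """Return this model's state once it has advanced to `time`."""


class RelaxationModel(LinkableModel):
    """Toy model whose state relaxes toward the value supplied by its provider."""

    def __init__(self, state: float, dt: float, rate: float):
        self.state = state
        self.dt = dt          # timestep length, may differ between models
        self.rate = rate
        self.time = 0.0
        self.provider = None  # the linked model this one pulls data from
        self._busy = False

    def get_values(self, time: float) -> float:
        # A request that arrives while this model is mid-step comes from the
        # two-way link; answer it with the latest known state.
        if self._busy:
            return self.state
        while self.time < time:
            self._step()
        return self.state

    def _step(self) -> None:
        self._busy = True
        try:
            target = self.time + self.dt
            other = self.provider.get_values(target)  # pull from linked model
            self.state += self.rate * (other - self.state) * self.dt
            self.time = target
        finally:
            self._busy = False


river = RelaxationModel(state=1.0, dt=1.0, rate=0.5)
groundwater = RelaxationModel(state=0.0, dt=0.5, rate=0.5)
river.provider = groundwater   # the river depends on the groundwater
groundwater.provider = river   # and the groundwater depends on the river

river_level = river.get_values(2.0)  # drives both models forward to t = 2.0
```

Requesting a value from one model is enough to advance both, because each model pulls what it needs from its provider before completing a step.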

Platform LSF

Platform LSF is a workload manager designed for use in large, high-performance computing environments. This commercial tool can be used to schedule complex scientific workflows and manage very large (up to petaFLOP scale) compute resources. It provides application support across distributed and heterogeneous platforms.

Basic programming skills | Cost-basis
Project Trident

Project Trident is a scientific workflow workbench that allows users to author workflows visually by using a catalog of existing activities and complete workflows. The workbench provides a tiered library that hides the complexity of different workflow activities and services for ease of use. Trident supports analysis and visualization workflows; composing, running, and cataloging experiments as workflows; and capturing provenance information. Workflows can be scheduled over high-performance clusters or cloud computing resources.

No programming | Free
  • Yogesh Simmhan, Roger Barga, Catharine van Ingen, Ed Lazowska, Alex Szalay, "Building the Trident Scientific Workflow Workbench for Data Management in the Cloud," advcomp, pp.41-50, 2009 Third International Conference on Advanced Engineering Computing and Applications in Sciences, 2009
  • Roger Barga, Jared Jackson, Nelson Araujo, Dean Guo, Nitin Gautam, Yogesh Simmhan, "The Trident Scientific Workflow Workbench," escience, pp.317-318, 2008 Fourth IEEE International Conference on eScience, 2008
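The notions of composing a workflow from a catalog of activities and capturing provenance, as described for Project Trident, can be sketched in a few lines of Python. This is a generic illustration and has no connection to Trident's actual implementation; the catalog entries, the record layout, and the function `run_workflow` are all assumptions.

```python
def run_workflow(catalog, activity_names, value):
    """Run the named activities in order and record provenance for each step."""
    provenance = []
    for name in activity_names:
        result = catalog[name](value)
        provenance.append({"activity": name, "input": value, "output": result})
        value = result
    return value, provenance


# A hypothetical catalog of reusable activities.
catalog = {
    "scale": lambda x: x * 2,
    "offset": lambda x: x + 1,
}

final_value, provenance = run_workflow(catalog, ["scale", "offset"], 3)
```

The provenance list records which activity produced each intermediate value, so a result can be traced back through every step that contributed to it.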
    The Predictive Ecosystem Analyzer (PEcAn)

    The Predictive Ecosystem Analyzer (PEcAn) is an integrated ecological bioinformatics toolbox and data assimilation system that synthesizes information contained in ecological models, data, and expert knowledge, using modern statistical methods and state-of-the-art ecosystem models. PEcAn has a web interface that enables users to run ecosystem models, as well as a suite of R packages that can be used for model-data fusion and more sophisticated analysis.

    Basic programming skills | Free

    PEcAn can be used to run ecosystem models through a web-based user interface, while advanced analyses can be performed using its suite of R packages. Although PEcAn is currently coupled to three diverse ecosystem models, it can be coupled to a broad class of simulation models. Integrating a new model requires writing a model-specific wrapper in R that translates inputs and outputs to and from the standard formats used by PEcAn, and registering the model and computer in the database.
    The PEcAn source repository is hosted on GitHub (http://github.com/PecanProject/pecan). See the Wiki (http://github.com/PecanProject/pecan) for more information.
    LeBauer, D.S., D. Wang, K. Richter, C. Davidson, & M.C. Dietze. (2013). Facilitating feedbacks between field measurements and ecosystem models. Ecological Monographs. doi:10.1890/12-0137.1
    Wang, D, D.S. LeBauer, and M.C. Dietze (2013) Predicting yields of short-rotation hybrid poplar (Populus spp.) for the contiguous US through model-data synthesis. Ecological Applications doi:10.1890/12-0854.1
    Dietze, M.C., D.S. LeBauer, R. Kooper (2013) On improving the communication between models and data. Plant, Cell, & Environment doi:10.1111/pce.12043
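The model wrapper pattern described for PEcAn above can be sketched as follows. PEcAn wrappers are written in R; this Python version only illustrates the pattern of translating into a model's own format, running the model, and translating back. The variable names, units, and the toy model are all hypothetical and do not reflect PEcAn's actual standard formats.

```python
def standard_to_model(met):
    """Translate standard-format drivers (Kelvin) to the toy model's format (Celsius)."""
    return {"tmp_c": [t - 273.15 for t in met["air_temperature_K"]]}


def run_toy_model(inputs):
    """Hypothetical stand-in for an ecosystem model: growth in g per m2 per step."""
    return {"growth_g_m2": [max(0.0, 2.0 * t) for t in inputs["tmp_c"]]}


def model_to_standard(raw):
    """Translate the toy model's output (g per m2) to the standard unit (kg per m2)."""
    return {"growth_kg_m2": [g / 1000.0 for g in raw["growth_g_m2"]]}


def run_wrapped(met):
    """The wrapper: convert inputs, run the model, convert outputs."""
    return model_to_standard(run_toy_model(standard_to_model(met)))


result = run_wrapped({"air_temperature_K": [263.15, 283.15]})
```

Because only the two translation functions know about the model's own format, swapping in a different model means rewriting those functions while the rest of the system stays unchanged.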


    Tika is a Java class library available from the Apache Software Foundation. It supports media type detection based on file type signatures, metadata extraction, and text parsing and extraction.

    Supported Document Formats:

  • HyperText Markup Language
  • XML and derived formats
  • Microsoft Office document formats
  • OpenDocument Format
  • Apple iWorks Formats
  • Portable Document Format
  • Electronic Publication Format
  • Rich Text Format
  • Compression and packaging formats
  • Text formats
  • Audio formats
  • Image formats
  • Video formats
  • Java class files and archives
  • Mail formats
  • The DWG (AutoCAD) format
  • Font formats
  • Scientific formats
    The Tika application can be run either from the command line or with a graphical user interface (GUI). Tika is written in Java, and the class library can be used directly in other programs where needed.

    Those with advanced programming skills can extend Tika to meet specific project or analysis needs not covered by the basic release. Tika is an open source project of the Apache Software Foundation and is available under the Apache License version 2.0 (ALv2).
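Media type detection based on file type signatures, mentioned above, means comparing the first bytes of a file against known byte patterns. The Python sketch below shows the principle only. It does not use or mirror Tika's Java API; the function `detect_media_type` and the small signature table are assumptions that cover just a few well-known formats.

```python
# Leading byte signatures ("magic numbers") for a few well-known formats.
SIGNATURES = [
    (b"%PDF-", "application/pdf"),
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
    (b"PK\x03\x04", "application/zip"),
]


def detect_media_type(data: bytes) -> str:
    """Return a media type based on the leading bytes of `data`."""
    for magic, media_type in SIGNATURES:
        if data.startswith(magic):
            return media_type
    # Generic fallback for unrecognized binary content.
    return "application/octet-stream"


pdf_type = detect_media_type(b"%PDF-1.7\n...")
png_type = detect_media_type(b"\x89PNG\r\n\x1a\n" + b"\x00" * 8)
unknown_type = detect_media_type(b"plain bytes with no known signature")
```

Several container formats share the ZIP signature, so a signature match alone cannot always tell them apart and further inspection of the contents would be needed.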