Introduction
------------

This directory is the official repository of public bioinformatics data sets
available from Ifremer research teams.

All these data sets are described on the Sextant portal:

https://sextant.ifremer.fr/eng/Data/Catalogue#/search?fast=index&&any=bioinfo*

In turn, each project directory contains a README.txt file that provides a DOI
linking the data to its description.

Data organization
-----------------

All projects are organized in a similar way: data is distributed across several
sub-folders, all of them named using terms from the EDAM Ontology
(https://www.ebi.ac.uk/ols/ontologies/edam).

 1/ Raw data is available from the 'data' sub-folder.
 2/ Results from processing raw data are available in the 'operation' sub-folder.
 3/ In turn, sub-folders within 'data' and 'operation' are named with EDAM terms
    depending on data type (data sub-folder) or operation type (operation
    sub-folder).

Here is an example for the Mass Spectrometry data set "COSELMAR" provided by
Ifremer's DYNECO research team:

dyneco/COSELMAR/
|
|--data
|  |--mass-spectrometry-data
|  |  |--negative-ion
|  |  |--positive-ion
|  |
|  |--report
|
|--operation
   |--mass-spectrum-visualisation
   |
   |--report
   |
   |--workflows

Data retrieval for non-Ifremer users
------------------------------------

(Ifremer users: DATAREF data is directly accessible from DATARMOR, so do not
re-download data!)

This data repository is accessible using two protocols:

 - ftp://ftp.ifremer.fr/ifremer/dataref/bioinfo/
 - https://data-dataref.ifremer.fr/bioinfo/ifremer/

Both URLs target the same storage.

Now, to download data:

 1/ consider using the ftp:// URL: this protocol (ftp stands for file transfer
    protocol) is dedicated to data transfer, so it is generally faster than
    https.
 2/ use either a command-line tool (such as wget or curl) or a graphical tool
    (such as FileZilla or CyberDuck) to download data.

Data download example
---------------------

Suppose you are looking at this project in your web browser:

https://data-dataref.ifremer.fr/bioinfo/ifremer/lep/bazoricus-momarsat/

To download the data, first convert the https URL to the ftp one, as follows:

 - from the https:// URL, keep only the project path: 'lep/bazoricus-momarsat'
 - append that path to this URL: ftp://ftp.ifremer.fr/ifremer/dataref/bioinfo/
 - you end up with: ftp://ftp.ifremer.fr/ifremer/dataref/bioinfo/lep/bazoricus-momarsat

Then, use wget from the command line to get the entire project data:

  cd /some/directory
  wget -r -np -nH --cut-dirs=4 ftp://ftp.ifremer.fr/ifremer/dataref/bioinfo/lep/bazoricus-momarsat/

Command line explained:

 -r: download recursively;
 -np: never ascend to the parent directory;
 -nH: do not create a directory named after the host, i.e. ftp.ifremer.fr;
 --cut-dirs: number of leading directories to ignore; here, we use 4 to
   avoid creating the path 'ifremer/dataref/bioinfo/lep'; as a consequence,
   the directory 'bazoricus-momarsat' is created directly within
   '/some/directory'.
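
The URL conversion described in this section can also be scripted. The
following is a minimal sketch, not an official Ifremer tool: the helper names
https_to_ftp and cut_dirs_for are hypothetical, and it assumes that every
project path sits directly under the two base URLs given in this README. It
only prints the wget command so that you can check it before running it.

```shell
#!/bin/sh
# Base URLs as given in this README.
HTTPS_BASE="https://data-dataref.ifremer.fr/bioinfo/ifremer/"
FTP_BASE="ftp://ftp.ifremer.fr/ifremer/dataref/bioinfo/"

# https_to_ftp (hypothetical helper): print the ftp URL that corresponds
# to an https project URL; fail if the URL is not under HTTPS_BASE.
https_to_ftp() {
    case "$1" in
        "$HTTPS_BASE"*) ;;
        *) echo "not a DATAREF bioinfo https URL: $1" >&2; return 1 ;;
    esac
    project_path="${1#"$HTTPS_BASE"}"   # e.g. lep/bazoricus-momarsat/
    project_path="${project_path%/}"    # drop the trailing slash, if any
    printf '%s%s/\n' "$FTP_BASE" "$project_path"
}

# cut_dirs_for (hypothetical helper): print the value to pass to wget's
# --cut-dirs so that only the last directory of the ftp URL is created.
cut_dirs_for() {
    path="${1#ftp://*/}"                # strip scheme and host name
    path="${path%/}"                    # drop the trailing slash, if any
    slashes=$(printf '%s' "$path" | tr -cd '/')
    echo "${#slashes}"                  # leading directories = number of '/'
}

# Print (do not run) the download command for the example project.
ftp_url=$(https_to_ftp "https://data-dataref.ifremer.fr/bioinfo/ifremer/lep/bazoricus-momarsat/")
echo "wget -r -np -nH --cut-dirs=$(cut_dirs_for "$ftp_url") $ftp_url"
```

For the example project, the printed command should match the wget command
given above. To use another project, replace the https URL in the last two
lines.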