INTRODUCTION

This directory is part of the official repository of public bioinformatics data sets available from IFREMER research teams.

PROJECT

searse

SEARSE project: metabarcoding data (V4-18S, V4V5-16S) from surface seawater of the New Caledonia lagoon

PUBLICATION

Published to ENA under the accession number PRJEB66118 (currently private)
doi: https://doi.org/10.12770/26840434-8354-4856-9d5e-e562c8de7252

DATA ORGANISATION

The project is organized as follows: data are distributed across several sub-folders, all of them named using the EDAM Ontology (https://www.ebi.ac.uk/ols/ontologies/edam).

1/ Raw data are available in the 'data' sub-folder.
2/ Results from processing raw data are available in the 'operation' sub-folder.
3/ In turn, sub-folders within 'data' and 'operation' are named with EDAM terms according to data type ('data' sub-folder) or operation type ('operation' sub-folder).

DIRECTORY CONTENT

The directory of each sample is detailed in the report. The year in each path corresponds to the arrival of the raw sequences, not to the time of sampling. In summary, the sequences can be found at the following paths:

- September 2019 / protist : "data/dna-sequence-raw/2019/protist"
- September 2019 / bacteria (16S V3-V4) : "data/dna-sequence-raw/2019/bacteria"
- September 2019 / bacteria (16S V4-V5) : "data/dna-sequence-raw/2020/bacteria"
- February 2020 / protist : "data/dna-sequence-raw/2020/protist"
- February 2020 / bacteria (16S V4-V5) : "data/dna-sequence-raw/2020/bacteria"
- December 2020 / protist : "data/dna-sequence-raw/2020/protist"
- December 2020 / bacteria (16S V4-V5) : "data/dna-sequence-raw/2021/bacteria"

DATA RETRIEVAL

    wget -r -np -nH --cut-dirs=4 ftp://ftp.ifremer.fr/ifremer/dataref/bioinfo/pfom/livacid

Command-line explained:
-r: download recursively;
-np: do not ascend to the parent directory;
-nH: do not create a local directory named after the host (i.e. ftp.ifremer.fr);
--cut-dirs: number of leading remote directory components to skip when creating local directories (here 4).
See https://data-dataref.ifremer.fr/bioinfo/ifremer/README.txt for more details.
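
EXAMPLE: LOCATING A CAMPAIGN FOLDER

Because the year in each path is the arrival year of the raw sequences and not the sampling year, the folder for a given campaign cannot be guessed from its date. The sketch below encodes the mapping listed under DIRECTORY CONTENT as a small shell function. The function name raw_data_path and the labels it accepts (2019-09, bacteria-v4v5, etc.) are hypothetical and introduced only for illustration; they are not part of the repository.

```shell
#!/bin/sh
# Hypothetical helper: print the relative raw-data folder for a sampling
# campaign (YYYY-MM) and a target, following the DIRECTORY CONTENT list.
# The campaign and target labels are illustrative assumptions.
raw_data_path() {
  case "$1/$2" in
    "2019-09/protist")       echo "data/dna-sequence-raw/2019/protist" ;;
    "2019-09/bacteria-v3v4") echo "data/dna-sequence-raw/2019/bacteria" ;;
    "2019-09/bacteria-v4v5") echo "data/dna-sequence-raw/2020/bacteria" ;;
    "2020-02/protist")       echo "data/dna-sequence-raw/2020/protist" ;;
    "2020-02/bacteria-v4v5") echo "data/dna-sequence-raw/2020/bacteria" ;;
    "2020-12/protist")       echo "data/dna-sequence-raw/2020/protist" ;;
    "2020-12/bacteria-v4v5") echo "data/dna-sequence-raw/2021/bacteria" ;;
    *)
      # Unknown combination: report on stderr and return a non-zero status.
      echo "unknown campaign/target: $1/$2" >&2
      return 1
      ;;
  esac
}
```

For instance, raw_data_path 2019-09 bacteria-v4v5 prints the 2020 bacteria folder, since those sequences arrived in 2020. Assuming the command is run from the top of a downloaded copy of the project, ls "$(raw_data_path 2020-02 protist)" would list the corresponding files.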