Benthic microbial community composition across the northern and southern Gulf of Mexico, 2012-2015

Authors: Will A. Overholt, Joel E. Kostka

Published On: Aug 09 2018 14:54 UTC

DOI: https://doi.org/10.7266/N70G3HN3

UDI: R4.x267.000:0086

Files

Individual Files

No. of Downloads: 15

No. of Files: 48

File Size: 3.04 GB

File Format(s):
csv, r, rds, txt, xlsx

Project Information

Funded By:
Gulf of Mexico Research Initiative

Funding Cycle:
RFP-IV

Research Group:
Center for the Integrated Modeling and Analysis of Gulf Ecosystems II (C-IMAGE II)

Point of Contact

Joel E. Kostka
Georgia Institute of Technology / School of Earth and Atmospheric Sciences
joel.kostka@biology.gatech.edu

Data Collection Period

2012-08-15

2015-08-09

Theme keywords

Biogeography, deep ocean sediment, microbial community, response and recovery, perturbations, oxygen, Deepwater Horizon, Ixtoc 1

Abstract:

This dataset includes three subsections. The first contains links to the 16S SSU rRNA gene sequences that were generated and deposited in the SRA at NCBI (BioProject PRJNA414249). The second contains the raw oxygen concentrations for core profiles. The third contains the R code used to analyze these data, along with the R data files (RDS) used as input for predicting microbial community composition across the Gulf, the models themselves (a random forest regression for each OTU), and the resulting predictions. Samples were collected from August 2012 to August 2015.

Suggested Citation:

Will A. Overholt, Joel E. Kostka. 2018. Benthic microbial community composition across the northern and southern Gulf of Mexico, 2012-2015. Distributed by: GRIIDC, Harte Research Institute, Texas A&M University–Corpus Christi. doi:10.7266/N70G3HN3

Publications:

Overholt, W. A., Schwing, P., Raz, K. M., Hastings, D., Hollander, D. J., & Kostka, J. E. (2019). The core seafloor microbiome in the Gulf of Mexico is remarkably consistent and shows evidence of recovery from disturbance caused by major oil spills. Environmental Microbiology, 21(11), 4316–4329. doi:10.1111/1462-2920.14794

Purpose:

Our objectives were to (1) characterize unimpacted sedimentary microbial communities to establish a baseline, (2) map the biogeographical patterns in microbial community structure across the Gulf of Mexico, (3) use this map to generate a Gulf of Mexico biogeography model of microbial community structure that can predict the abundance of dominant microbial populations, and (4) determine whether impacted regions had returned to baseline conditions.

Data Parameters and Units:

Oxygen spreadsheet: Site (named sampling location), Cast (1, 2, 3; which deployment of the multicorer was sampled; in almost all cases only 1 deployment was performed and this number is 1), Core (1, 2, 3; replicates from a single multicorer deployment, nearly always 1), Date (“MM-YYYY”), Water Depth (m), Latitude (decimal degrees), Longitude (decimal degrees), Depth (mmbsf, mm below the seafloor; sediment column depth, with 0 referring to the surface), Oxygen (µmol/L), Region (NGoM or SGoM, for northern or southern Gulf of Mexico, depending on sampling location).

Sequences: Spreadsheet detailing sample names and BioProject ID for the NCBI SRA database: Sample_Name, bioproject accession number, collection date (MM-YY), sediment depth (cmbsf, cm below the seafloor), Water Depth (m), env_biom (NCBI defined), env_feature (NCBI defined), env_material (NCBI defined), geo_loc_name (NCBI defined), lat_lon (decimal degrees), Core Name (the name of the site sampled), Cast_Replicate_Number (core replicates from a specific sampling time point; 1-3 = cast replicates, a-c = core replicates within a cast, tech = technical replicates, i.e. replicate DNA extraction from the same sediment).

R_script+Models: Defined in the attached csv form (index_README.csv). The root directory contains all R scripts; input_files contains datasets that are utilized by the R code; R_data_files contains R_data_structures generated by the R code; R_scripts_on_cluster has 2 scripts to generate model results using GA Tech’s high performance computing cluster. Operational taxonomic units clustered by taxonomic names at the class level and taxonomic assignments
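
The oxygen spreadsheet columns described above could be loaded into typed records roughly as follows. This is a minimal Python sketch, not part of the dataset: the header spellings, the sample rows, and the names OxygenReading, parse_row and read_profile are all assumptions made for illustration, and the real file may label its columns differently. It parses the "MM-YYYY" date and converts the oxygen-profile depth from mmbsf to cmbsf, the unit used in the sequence spreadsheet.

```python
import csv
import io
from dataclasses import dataclass
from datetime import datetime

# Hypothetical rows: header names and values are illustrative only,
# not copied from the dataset.
SAMPLE_CSV = """Site,Cast,Core,Date,Water Depth,Latitude,Longitude,Depth,Oxygen,Region
SiteA,1,1,08-2012,1500,28.5,-88.5,0,180.0,NGoM
SiteA,1,1,08-2012,1500,28.5,-88.5,5,90.0,NGoM
SiteA,1,1,08-2012,1500,28.5,-88.5,10,12.5,NGoM
"""

VALID_REGIONS = {"NGoM", "SGoM"}


@dataclass
class OxygenReading:
    site: str
    cast: int                 # multicorer deployment (usually 1)
    core: int                 # replicate within a deployment (usually 1)
    date: datetime            # parsed from "MM-YYYY"; day defaults to 1
    water_depth_m: float
    latitude: float           # decimal degrees
    longitude: float          # decimal degrees
    depth_mmbsf: float        # mm below the seafloor, 0 = sediment surface
    oxygen_umol_per_l: float
    region: str               # "NGoM" or "SGoM"

    @property
    def depth_cmbsf(self) -> float:
        """Depth in cm below the seafloor (10 mm = 1 cm)."""
        return self.depth_mmbsf / 10.0


def parse_row(row: dict) -> OxygenReading:
    """Convert one CSV row (assumed header names) into a typed record."""
    region = row["Region"].strip()
    if region not in VALID_REGIONS:
        raise ValueError(f"unexpected region: {region!r}")
    return OxygenReading(
        site=row["Site"].strip(),
        cast=int(row["Cast"]),
        core=int(row["Core"]),
        date=datetime.strptime(row["Date"].strip(), "%m-%Y"),
        water_depth_m=float(row["Water Depth"]),
        latitude=float(row["Latitude"]),
        longitude=float(row["Longitude"]),
        depth_mmbsf=float(row["Depth"]),
        oxygen_umol_per_l=float(row["Oxygen"]),
        region=region,
    )


def read_profile(text: str) -> list:
    """Parse CSV text into a list of OxygenReading records."""
    return [parse_row(r) for r in csv.DictReader(io.StringIO(text))]


readings = read_profile(SAMPLE_CSV)
```

Keeping the original mmbsf value and exposing cmbsf as a derived property avoids losing the source unit while still allowing oxygen depths to be compared with the sediment depths listed for the sequenced samples.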