Complementary extraction methods to identify Park Grass soil metaproteome (from Quinn et al., 2022)
This ProteomeXchange dataset is derived from work on Park Grass Experiment soil by Quinn et al. (Applied Soil Ecology, 2022) and is available via the PRIDE database. A modified protein extraction method, heat/thaw/phenol/chloroform (HTPC), was combined with the established surfactant extraction method to identify proteins from Park Grass Experiment (PGE) soil, which has an extensively sequenced microbial database. The resulting dataset is available from the ProteomeXchange Consortium. Project accession: PXD017392, Project DOI: 10.6019/PXD017392.
Sample Processing Protocol: The Park Grass Experiment (PGE) soil was sampled at Rothamsted Research. The soil was previously characterised as a stagnogleyic paleo-argillic brown earth, classified as Chromic Luvisol (F.A.O., 1990) or Aquic Paleudalf (U.S.D.A., 1992) (Silvertown et al., 2006). Soil samples were collected 3 metres outside a control (untreated) plot (Plot 3), similar to the areas sampled for the Rothamsted soil metagenome, which was used to construct the proteomic database (Delmont et al., 2011). Five different soil samples were collected from random 1 x 1 m squares in an area of 25 m². A section of turf (50 x 50 cm) within each sample square was cut on three sides with a serrated knife and peeled back to expose the soil beneath. Several kilograms of soil were removed from the 5-10 cm depth, in addition to a soil core (100 cm³), which was used to calculate soil bulk density. Soil samples were transported to the laboratory immediately after sampling and mixed gently on the day of collection. One kilogram of soil was separated and frozen at -80 °C for proteomic analysis, whilst the rest of the bulk soil was sieved to 5 mm and stored at 4 °C for biochemical and physical analysis.
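Bulk density from a fixed-volume core is conventionally the oven-dry mass of the core divided by its volume. The following is a minimal sketch under that assumption; the source does not describe the drying procedure, the function name is hypothetical, and the 120 g dry mass is purely illustrative, not a measured value.

```python
CORE_VOLUME_CM3 = 100.0  # core volume stated in the sampling protocol


def bulk_density(dry_mass_g: float, core_volume_cm3: float = CORE_VOLUME_CM3) -> float:
    """Return soil bulk density in g/cm^3 (dry mass divided by core volume)."""
    if dry_mass_g < 0:
        raise ValueError("dry mass must be non-negative")
    if core_volume_cm3 <= 0:
        raise ValueError("core volume must be positive")
    return dry_mass_g / core_volume_cm3


# Illustrative value only: a core holding 120 g of dry soil.
example_density = bulk_density(120.0)
```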
Data Processing Protocol: Peptide identifications were accepted if they could be established at a protein probability threshold greater than 95%, with a minimum of 1 peptide and a peptide probability of 50%, as assigned by the Protein Prophet algorithm (Nesvizhskii et al., 2003; Searle, 2010). Proteins containing similar peptides that could not be differentiated based on MS/MS analysis alone were grouped to satisfy the principles of parsimony (Searle, 2010). The combined peak lists and peptide and protein identification results were exported with a 2.3% false discovery rate (FDR) threshold. In the manuscript, protein identifications based on one peptide identification are noted as low stringency and protein identifications based on two or more peptide matches as high stringency. Proteins from the soil database matching mass spectra from the soil protein extraction were identified through NCBI (18th July 2016). Proteins were further annotated with phyla, protein families and GO terminology through EBI. The GO terminology was used to assign functional categories in the RAST/SEED categorisation. If the protein function crossed several categories, it was listed as unclassified. If the protein could not be traced to any category, it was classed as unknown. Not all proteins with protein ID and GI numbers could be classified and annotated with GO terms (from PRIDE Project PXD017392 accessed 09/11/22).
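The acceptance, stringency and categorisation rules above can be sketched in a few lines. This is an illustration only, not the software used in the study: the class and function names are hypothetical, probabilities are expressed as fractions, and treating the 50% peptide probability as an inclusive lower bound is an assumption.

```python
from dataclasses import dataclass
from typing import List, Sequence

PROTEIN_PROB_MIN = 0.95  # protein probability must be greater than 95%
PEPTIDE_PROB_MIN = 0.50  # peptide probability of 50% (assumed inclusive)
MIN_PEPTIDES = 1         # minimum number of peptides per protein


@dataclass
class ProteinHit:
    """Hypothetical record for one protein identification."""
    accession: str
    protein_prob: float
    peptide_probs: List[float]


def passing_peptides(hit: ProteinHit) -> int:
    """Count peptides that meet the peptide probability threshold."""
    return sum(1 for p in hit.peptide_probs if p >= PEPTIDE_PROB_MIN)


def is_accepted(hit: ProteinHit) -> bool:
    """Apply the protein probability and minimum peptide criteria."""
    return hit.protein_prob > PROTEIN_PROB_MIN and passing_peptides(hit) >= MIN_PEPTIDES


def stringency(hit: ProteinHit) -> str:
    """One peptide match is low stringency; two or more is high stringency."""
    return "high" if passing_peptides(hit) >= 2 else "low"


def functional_category(categories: Sequence[str]) -> str:
    """No category gives 'unknown'; several categories give 'unclassified'."""
    distinct = sorted(set(categories))
    if not distinct:
        return "unknown"
    if len(distinct) > 1:
        return "unclassified"
    return distinct[0]
```

For example, a hit with protein probability 0.99 and peptide probabilities 0.9, 0.6 and 0.3 would be accepted and labelled high stringency, because two of its peptides meet the 50% threshold.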
Organism: uncultivated soil bacterium; NCBI TaxID: 97476.
The dataset was deposited with the ProteomeXchange Consortium via the PRIDE partner repository and is publicly available. Project accession: PXD017392, Project DOI: 10.6019/PXD017392.
Sarah Perryman and Nathalie Castells created this webpage (published live 15th November 2022), which was checked by Gerry Quinn (17th November 2022).
For further information and assistance, please contact the e-RA curators, Sarah Perryman and Margaret Glendining, using the e-RA email address: era@rothamsted.ac.uk.