Towards an Australian Map of Environmental Microbial Communities

To begin developing a map of Australian environmental microbial communities, Bioplatforms Australia (BPA) and the research community have formed an Australian Microbiome Network and collectively invested approximately $10M in collecting thousands of soil and water samples around the country, extracting DNA from these samples, and producing DNA sequence data from them through the Biomes of Australian Soil Environments (BASE) consortium and the Marine Microbes (MM) consortium.


Australian Microbiome Network (2017)


The network, through the BASE and MM consortia, has established robust sample collection and metagenomics protocols and, along with BPA, has established a data repository to house the raw and derived BASE and MM microbial diversity data, as well as contextual information for the collection sites and times (e.g. geolocation, local soil chemistry, water content). This contextual information makes the datasets findable in the data repository.


BASE collection sites (March 2018)



The general BASE/MM workflow is shown below. 
  1. Unique identifiers are assigned by BPA for the soil/water samples to be collected (LHS top). 
  2. Once sites are selected, soil or water samples are collected at each site. DNA is then extracted, specific regions are amplified (e.g. the bacterial 16S rRNA gene), and the amplified DNA is sequenced. The raw sequence data is fed into the BPA Data Repository (LHS middle). 
  3. Contextual information (including the unique identifier assigned in step 1) about each sample collection site is compiled by the collection teams off-line using Excel spreadsheets. These are then uploaded into the BPA Data Repository (LHS bottom).
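
The role of the unique identifier in steps 1 to 3 can be illustrated with a minimal Python sketch. It assumes that sequence records and contextual rows both carry the identifier assigned in step 1. The field names (`sample_id`, `amplicon`, `latitude`, and so on), the identifiers, and the function `link_by_sample_id` are hypothetical and are not taken from the BPA Data Repository.

```python
# Hypothetical sequence records, one per sequenced sample (illustrative values only).
sequence_records = [
    {"sample_id": "S0001", "amplicon": "16S", "file": "S0001_16S.fastq.gz"},
    {"sample_id": "S0002", "amplicon": "16S", "file": "S0002_16S.fastq.gz"},
]

# Hypothetical contextual rows, as might be compiled in a spreadsheet.
contextual_rows = [
    {"sample_id": "S0001", "sample_type": "soil", "latitude": -35.3, "longitude": 149.1},
]


def link_by_sample_id(sequence_records, contextual_rows):
    """Attach contextual information to each sequence record via the shared identifier.

    Returns (linked, missing): linked is a list of merged dictionaries, and
    missing lists the identifiers that have sequence data but no contextual row.
    """
    context_by_id = {row["sample_id"]: row for row in contextual_rows}
    linked, missing = [], []
    for record in sequence_records:
        context = context_by_id.get(record["sample_id"])
        if context is None:
            missing.append(record["sample_id"])
        else:
            # Merge the two dictionaries; the shared sample_id appears once.
            linked.append({**context, **record})
    return linked, missing


linked, missing = link_by_sample_id(sequence_records, contextual_rows)
```

Keeping a list of identifiers with no contextual row is one way to spot samples whose spreadsheet entry has not yet been uploaded.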


Computational pipelines are also applied to the raw data. For the BASE and MM projects these are managed by CSIRO and do three things: 
  • determine all the OTU (operational taxonomic unit) sequences found in each sample 
  • determine which taxon corresponds to each OTU sequence (e.g. What species is a particular OTU associated with? Or, if the OTU cannot be matched to a specific species, what larger taxonomic unit, such as a genus, does it correspond to?)
  • determine the abundance of each OTU in each sample
In the above diagram, for simplicity, the computational pipelines applied by CSIRO for these consortia are shown within the BPA Data Repository.
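
The three outputs listed above can be pictured with a toy Python sketch. This is not the CSIRO pipeline: as a deliberate simplification, each distinct sequence is treated as its own OTU, whereas real pipelines typically group sequences by similarity. The reads, the reference lookup, and the function names `build_otu_table` and `assign_taxonomy` are all hypothetical.

```python
from collections import Counter

# Hypothetical amplicon reads per sample (toy sequences, not real data).
reads_by_sample = {
    "S0001": ["ACGTAC", "ACGTAC", "TTGCAA"],
    "S0002": ["TTGCAA", "GGGCCC"],
}

# Hypothetical reference lookup: sequence -> (taxonomic rank, name).
reference_taxonomy = {
    "ACGTAC": ("species", "Hypothetical species A"),
    "TTGCAA": ("genus", "Hypothetical genus B"),
}


def build_otu_table(reads_by_sample):
    """Label each distinct sequence as an OTU and count it per sample."""
    otu_ids = {}  # sequence -> OTU label, in order of first appearance
    table = {}    # sample -> {OTU label: abundance}
    for sample_id, reads in reads_by_sample.items():
        counts = Counter()
        for seq in reads:
            otu = otu_ids.setdefault(seq, f"OTU_{len(otu_ids) + 1}")
            counts[otu] += 1
        table[sample_id] = dict(counts)
    return otu_ids, table


def assign_taxonomy(otu_ids, reference_taxonomy):
    """Look up each OTU sequence; unmatched OTUs are marked unclassified."""
    return {
        otu: reference_taxonomy.get(seq, ("unclassified", None))
        for seq, otu in otu_ids.items()
    }


otu_ids, otu_table = build_otu_table(reads_by_sample)
taxonomy = assign_taxonomy(otu_ids, reference_taxonomy)
```

Here `otu_table` holds the abundance of each OTU in each sample, and `taxonomy` records the most specific rank available for each OTU, which mirrors the species-or-higher-rank question in the second bullet.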
