WGS

How is whole genome sequencing (WGS) data being used?

Whole genome sequence (WGS) analysis can be used for high-resolution subtyping of foodborne pathogens, such as E. coli O157, Salmonella and Listeria monocytogenes. WGS analysis has been used by regulatory (e.g., FDA-CFSAN, USDA-FSIS) and public health (e.g., CDC) agencies in surveillance, traceback and outbreak investigations. By comparing the WGS data from various isolates of the same pathogen, agencies can assess how genetically related the isolates are and identify (i) clusters of genetically related clinical isolates (i.e., cluster detection), (ii) clusters of genetically related isolates obtained over a long period of time from a single facility (i.e., persistent/resident strains), and (iii) associations between clinical and environmental/food isolates (i.e., source detection).
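To make cluster detection concrete, here is a minimal sketch that groups isolates whose pairwise genetic distance falls at or below a cutoff (single-linkage grouping). The isolate names, the distances, and the cutoff of 10 differences are hypothetical values chosen for illustration; they are not criteria used by any agency, and this is not how FDA or CDC implement cluster detection.

```python
def cluster_isolates(distances, threshold):
    """Single-linkage grouping: two isolates share a cluster if a chain of
    pairs, each within `threshold` differences, connects them."""
    parent = {}

    def find(x):
        # Union-find lookup with path compression.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), d in distances.items():
        root_a, root_b = find(a), find(b)  # registers both isolates
        if d <= threshold and root_a != root_b:
            parent[root_b] = root_a

    groups = {}
    for isolate in list(parent):
        groups.setdefault(find(isolate), set()).add(isolate)
    return sorted((sorted(g) for g in groups.values()), key=lambda g: g[0])


# Hypothetical pairwise distances (e.g., SNP or allele differences).
distances = {
    ("clinical_1", "clinical_2"): 3,
    ("clinical_1", "food_1"): 5,
    ("clinical_2", "food_1"): 4,
    ("clinical_1", "env_1"): 60,
    ("clinical_2", "env_1"): 58,
    ("food_1", "env_1"): 61,
}

HYPOTHETICAL_THRESHOLD = 10  # illustrative only, not a regulatory cutoff
clusters = cluster_isolates(distances, HYPOTHETICAL_THRESHOLD)
print(clusters)
```

In this toy data, the two clinical isolates and the food isolate fall into one group (a pattern that would correspond to source detection), while the environmental isolate stays separate.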

Genetic relationship is assessed through two main methods: single nucleotide polymorphism (SNP) differences and core genome multilocus sequence typing (cgMLST) allelic differences. Although both methods are based on WGS data and tend to provide similar results, each requires specific bioinformatic tools and has its own advantages and disadvantages. FDA-CFSAN uses SNP differences to assess the genetic relationship of isolates, while CDC uses cgMLST allelic differences.
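The difference between the two measures can be illustrated with a short sketch. It assumes the sequences have already been aligned to the same length and that each cgMLST profile is a mapping from locus name to allele number; the isolate names, sequences, and profiles below are invented toy data. This is not the CFSAN SNP Pipeline or CDC's cgMLST workflow, both of which start from raw sequencing reads and apply their own processing and filtering steps.

```python
from itertools import combinations


def snp_distance(seq_a, seq_b):
    """Count aligned positions where both sequences have an unambiguous
    base (A, C, G or T) and the bases differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    valid = set("ACGT")
    return sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a in valid and b in valid and a != b
    )


def allele_distance(profile_a, profile_b):
    """Count loci where both isolates have an allele call and the calls differ.
    A locus with a missing call (None) is skipped."""
    shared_loci = profile_a.keys() & profile_b.keys()
    return sum(
        1
        for locus in shared_loci
        if profile_a[locus] is not None
        and profile_b[locus] is not None
        and profile_a[locus] != profile_b[locus]
    )


# Toy aligned sequences (hypothetical; N marks an ambiguous base).
alignments = {
    "isolate_A": "ACGTACGTAC",
    "isolate_B": "ACGTACGTAT",
    "isolate_C": "ACGAACGNAT",
}

# Toy cgMLST profiles (hypothetical locus names and allele numbers).
profiles = {
    "isolate_A": {"locus1": 1, "locus2": 4, "locus3": 7},
    "isolate_B": {"locus1": 1, "locus2": 4, "locus3": 8},
    "isolate_C": {"locus1": 2, "locus2": None, "locus3": 8},
}

pairs = list(combinations(sorted(alignments), 2))
snp_matrix = {(a, b): snp_distance(alignments[a], alignments[b]) for a, b in pairs}
allele_matrix = {(a, b): allele_distance(profiles[a], profiles[b]) for a, b in pairs}

for pair in pairs:
    print(pair, "SNPs:", snp_matrix[pair], "alleles:", allele_matrix[pair])
```

The SNP measure counts individual differing positions, whereas the allelic measure counts differing genes (loci), so several SNPs within one locus would register as a single allelic difference.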

Public databases host the WGS data and associated metadata obtained by regulatory and public health agencies and provide tools that allow for visualization of the WGS analysis (e.g., NCBI Pathogen Detection: https://www.ncbi.nlm.nih.gov/pathogens/). 

Resources for Industry

In the US, FDA may ask companies to participate in calls that discuss WGS results related to their facilities or supply chains. To prepare for these calls, it may be valuable to review the following papers and webinars:

  • This paper details FDA’s interpretation of SNP differences: Interpreting Whole-Genome Sequence Analyses of Foodborne Bacteria for Regulatory Applications and Outbreak Investigations https://www.frontiersin.org/article/10.3389/fmicb.2018.01482
  • This paper describes the method used by FDA to assess the number of SNP differences between isolates: CFSAN SNP Pipeline: an automated method for constructing SNP matrices from next-generation sequence data https://peerj.com/articles/cs-20/

WGS webinars in partnership with Western Growers

  • This self-paced online curriculum, hosted on Cornell Canvas, aims to provide (i) the foundational knowledge needed to understand the genetics and genomics of microbial organisms, as well as how WGS data is acquired and processed; (ii) intermediate knowledge of how WGS data is used to generate outputs such as allele codes, phylogenetic trees and single nucleotide polymorphism (SNP) distance matrices, along with data visualization using SEDRIC and epi-lab communication; and (iii) advanced knowledge on specific topics (e.g., allele X codes, REP strains) that may present additional challenges to surveillance and outbreak investigation.

 

REP Strains

Description: Reoccurring, Emerging, or Persisting (REP) strains are bacterial pathogen strains that CDC has identified as pathogens of interest and that FDA has identified as “reasonably foreseeable hazards” that need specific preventive controls.

REP Strain Resources: 

(1) CDC data summary on E. coli O157:H7 REP strain linked to different sources, including recreational water, ground beef, and romaine lettuce: https://www.cdc.gov/ecoli/rep-strain/index.html#:~:text=REPEXH01%20is%20a%20persistent%20strain,ground%20beef%2C%20and%20romaine%20lettuce. 

(2) Check out our Food Industry Virtual Office Hours on REP strains and their relevance to produce safety, developed by Martin Wiedmann and Renato Orsi, and presented by Martin Wiedmann on February 26th, 2024: https://www.youtube.com/watch?v=Qk_I8FdLZYI 

This session of Food Industry Virtual Office Hours features Dr. Martin Wiedmann discussing how Reoccurring, Emerging, or Persisting (REP) strains of several foodborne pathogens, including Salmonella, E. coli O157:H7, Listeria, and Campylobacter, are identified and monitored. REP strains are identified based on the number of illnesses and outbreaks, whether illnesses are increasing, and the characteristics of the strain (e.g., multidrug resistance, increased virulence and/or transmissibility). CDC uses information gained from investigations of REP strains to better understand their sources, track how they change over time, and collaborate on measures to reduce their spread. Sources associated with REP strains include leafy greens, potatoes, cheese, chicken, turkey and cows (beef and dairy). The session provides an introduction to REP strains and their possible implications for industry. For example, for some of these strains there may be expectations for industry to implement enhanced control strategies.

Companies are also strongly encouraged to recruit a WGS expert to join the call with them. Members of the Produce Safety CoE are willing to serve in this capacity. Please contact Martin Wiedmann (martin.wiedmann [at] cornell.edu) or Renato Orsi (renato.orsi [at] cornell.edu) if you need help.