Assignment Task
General background information and overview.
Speciata pretendus is a haploid fungus with a life-cycle similar to that of baker’s yeast, Saccharomyces cerevisiae. This organism is used in the production of a biotechnologically important compound. However, the process also generates two by-products, ‘Compound A’ and ‘Compound B’, which are converted to ‘Compound C’, which is considered a contaminant. This conversion occurs via a biochemical pathway involving ‘Enzyme 1’ that is illustrated below.
A new research lab is investigating a number of strategies to try to reduce the production of Compound C. The Enzyme 1 protein is involved in other essential biochemical reactions at a low level and cannot be mutated, so instead the researchers are attempting to control its expression so that it is not expressed at a high level. Their research focuses on a gene called regN, which codes for a transcription factor called ‘Regulator N’ that acts as a negative regulator of enz1 gene expression.
Outlined below are four approaches that are being used to investigate how they can manipulate regN and other regulatory genes to reduce Compound C production in S. pretendus and related species. Each of these approaches corresponds to a section of the assignment, for which you will need to analyse data and explain your rationale.
Section 1. They want to introduce specific mutations to increase the activity of RegN. Your task is to evaluate the consequences of different mutations on the ability of RegN to regulate Enzyme 1.
Section 2. A related species, Speciata hypothetica, does not appear to produce Compound C under laboratory conditions. They have cloned and sequenced a molecule of DNA that they think may contain a gene related to the regN gene of Speciata pretendus. Your task is to confirm the annotation of the regN gene from S. hypothetica.
Section 3. They have found two additional genes, regM and regL, in S. pretendus that contribute to Compound C production and they want to determine how these interact with regN. Your task is to study the data from genetic crosses to determine the genetic interaction and linkage between genes regulating the expression of the gene for Enzyme 1.
Section 4. They wish to explore how to manipulate the expression of Enzyme 1 to reduce the production of Compound C. Your task will be to predict how mutations in regulatory genes will affect gene expression and Compound C production.
Instructions for completing the assignment
- In each section you are provided with the relevant information and appropriate data to complete the analysis.
- The questions to be addressed for each section are provided at the end of the document.
- The total word count should not exceed 1000 words. The approximate word counts shown in square brackets next to the questions are a guide only.
Section 1: Evaluating the consequences of mutations
The Regulator N protein contains multiple regions that are important for its function. Each of these regions is shown on the annotated DNA sequence provided on pages 3 and 4.
- The C6 zinc cluster domain, which is required for zinc binding and binding to DNA, is shown in gold coloured, underlined text. The characteristic sequence of the domain is represented as Cys-X2-Cys-X6-Cys-X5-12-Cys-X2-Cys-X6-8-Cys (where X may denote any amino acid). The specific cysteine residues involved in co-ordination of the zinc atoms are highlighted in aqua.
- A linker region adjacent to the zinc finger that is required for recognising specific target DNA sequences is shown in red underlined text.
- A region required for dimerization, through which RegN forms homodimers that inhibit its ability to bind DNA, is indicated in grey underlined text.
- A C-terminal region of the protein that allows it to function as an activator for some target genes (but not for enz1) is shown in the yellow highlighted text. The acidic and hydrophobic residues are particularly important for this activity.
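The zinc cluster consensus described above can be checked against a protein sequence with a regular expression. The following is a minimal sketch only: the function name and the example sequences are hypothetical and are not the RegN sequence from pages 3 and 4.

```python
import re

# C6 zinc cluster consensus: Cys-X2-Cys-X6-Cys-X5-12-Cys-X2-Cys-X6-8-Cys,
# where X is any amino acid (written "." in the regular expression).
ZINC_CLUSTER = re.compile(r"C.{2}C.{6}C.{5,12}C.{2}C.{6,8}C")

def find_zinc_cluster(protein):
    """Return (start, end, matched_text) for the first motif match, or None.

    Positions are 1-based residue numbers; `protein` is a one-letter
    amino acid string in upper case.
    """
    match = ZINC_CLUSTER.search(protein)
    if match is None:
        return None
    return (match.start() + 1, match.end(), match.group())

# Hypothetical sequences for illustration only.
example = "MSTACRKCKLRKIKCDGQRPSCSNCRRLGLECTYD"
mutant = "MSTACRKCKLRKIKSDGQRPSCSNCRRLGLECTYD"  # third Cys replaced by Ser

print(find_zinc_cluster(example))
print(find_zinc_cluster(mutant))
```

In this toy example, replacing any one of the six cysteines removes the match, which is one way to see why substitutions at those residues are likely to disrupt the domain.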
The DNA sequence variants to be analysed are provided below, including the details of the mutation and nucleotide position. The positions refer to the relative position in the sequence of the gene and polypeptide sequence provided on pages 3 and 4.
Variant 1 T to A transversion at 1470
Variant 2 C to G transversion at 1173
Variant 3 T to G transversion at 2764
Variant 4 A to C transversion at 688
Variant 5 T to G transversion at 634
Variant 6 Deletion of 5’-GAGATT-3’ at 667
Your task is to:
- Analyse each of the variants and identify the type of mutation (missense, nonsense, frameshift, silent etc). Determine the likely consequence for the properties of the transcription factor and its function.
The potential consequences of the sequence polymorphism on the RegN protein properties and function include:
- Dramatic, subtle or no change in RegN polypeptide properties such as charge and size
- Dramatic, subtle or no change in Enzyme 1 (enz1) gene expression. If there is a change, indicate whether this will be an increase or a decrease in gene expression
- Other impacts on RegN function
If you wanted to increase the activity of RegN so that less Compound C is produced, which one or two variants would you predict could be useful? And which one or two variants would you definitely not use?
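The mutation types named above can be distinguished by comparing the original and altered codons. The sketch below illustrates that reasoning under loudly stated assumptions: it uses the standard genetic code, it treats the position as counted from the first base of the start codon of an intron-free coding sequence (which may not match the numbering on pages 3 and 4), and the function names and toy sequence are hypothetical.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def classify_substitution(cds, pos, new_base):
    """Classify a single-base substitution in a coding sequence.

    `pos` is 1-based and ASSUMED to be counted from the A of the ATG
    start codon; `cds` is upper-case DNA with no introns.
    """
    codon_start = (pos - 1) // 3 * 3
    offset = (pos - 1) % 3
    old_codon = cds[codon_start:codon_start + 3]
    new_codon = old_codon[:offset] + new_base + old_codon[offset + 1:]
    old_aa, new_aa = CODON_TABLE[old_codon], CODON_TABLE[new_codon]
    if old_aa == new_aa:
        return "silent"
    if new_aa == "*":
        return "nonsense"
    return "missense"

def classify_indel(length):
    """An insertion or deletion shifts the reading frame unless its
    length is a multiple of three."""
    return "in-frame" if length % 3 == 0 else "frameshift"

# Hypothetical toy coding sequence: ATG TGT AAA TGG GAA TAA (M C K W E stop).
toy_cds = "ATGTGTAAATGGGAATAA"
print(classify_substitution(toy_cds, 6, "C"))  # TGT -> TGC
print(classify_substitution(toy_cds, 6, "A"))  # TGT -> TGA
print(classify_substitution(toy_cds, 4, "A"))  # TGT -> AGT
print(classify_indel(4), classify_indel(3))
```

The classification alone does not predict the functional consequence; that also depends on which of the annotated regions the altered residue falls in and on how different the new amino acid is in charge and size.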
Section 2: Annotation of the regN gene from S. hypothetica
Two research students studying the organism S. hypothetica have obtained a DNA sequence for a gene that is believed to be related to regN from S. pretendus (the gene analysed in Section 1). The two students disagree on the correct annotation of the regN gene. The different DNA sequence annotations (annotations 1 and 2) are provided on pages 6 and 7. Both forward and reverse DNA strands are shown, but since the students agree that the coding sequence of the gene is on the top (forward) strand, only three reading frames are shown. The predicted mRNA and protein for Annotation 1 are in purple text, whereas for Annotation 2 they are in orange text.
Your task is to:
- Analyse each of the annotations and provide an explanation for which annotation you think is most likely to be correct.
- Explain why you do or do not believe that the sequence isolated from S. hypothetica contains a gene that is related to regN from S. pretendus (analysed in Section 1).
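One way to compare competing annotations is to scan the three forward reading frames for open reading frames. The sketch below is illustrative only: it assumes the standard genetic code and an intron-free, upper-case sequence, reports only the longest ATG-to-stop ORF per frame, and uses a hypothetical function name and toy sequence rather than the sequences on pages 6 and 7.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def translate(cds):
    """Translate every complete codon of `cds` into one-letter amino acids."""
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds) - 2, 3))

def longest_orfs(dna):
    """Return {frame: (start, end, protein)} for forward frames 1 to 3.

    Only ORFs that begin with ATG and end with an in-frame stop codon are
    counted. Positions are 1-based and `end` includes the stop codon.
    Frames without a complete ORF are omitted.
    """
    orfs = {}
    for frame in range(3):
        best = None    # (start, end) as 0-based half-open indices
        start = None   # index of the first ATG since the last stop codon
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None:
                if codon == "ATG":
                    start = i
            elif CODON_TABLE[codon] == "*":
                if best is None or (i + 3 - start) > (best[1] - best[0]):
                    best = (start, i + 3)
                start = None
        if best is not None:
            s, e = best
            orfs[frame + 1] = (s + 1, e, translate(dna[s:e - 3]))
    return orfs

# Hypothetical toy sequence: only frame 3 contains a complete ORF.
toy_dna = "CCATGAAATGGTAGGG"
print(longest_orfs(toy_dna))
```

An ORF scan is only a starting point. Features such as promoter elements, introns and similarity of the predicted protein to known domains would also need to be weighed when judging an annotation.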
Section 3: Interaction between genes regulating the expression of the gene for Enzyme 1
As mentioned in the background information, Enzyme 1 is required for production of Compound C, and its expression is repressed by Regulator N. Several other factors are also believed to regulate Enzyme 1 expression, including Regulators M and L. Null mutants in the genes coding for Enzyme 1 and Regulators N, M and L have been created and assayed for the amount of Compound C that is produced (Table 3.1).
To analyse the possible genetic interactions of the factors regulating Enzyme 1, pair-wise crosses of the null (complete loss of function) mutants in regN, regM and regL were performed. As mentioned previously, the life-cycle of S. pretendus is similar to that of the baker’s yeast S. cerevisiae. This includes a sexual life-cycle in which single-celled haploid parent cells fuse to create a transient diploid, which then undergoes meiosis to produce haploid spores.
The amount of Compound C was measured for the progeny of each cross, and the progeny were grouped into classes based on the amount observed (Class A, 0-200 units; Class B, 200-400 units; Class C, 400-600 units; Class D, 600-1000 units; Class E, 1000-1400 units). The number of progeny in each class for each pair-wise cross is shown in Figure 3.2.
Your task is to:
- Analyse the data provided, which show the number of progeny observed in each phenotypic class for each cross, and explain the linkage relationships between the three genes encoding Regulators N, M and L.
- Describe which of the proposed models for how Regulators N, M and L interact to contribute to Enzyme 1 expression (Figure 3.3) is best supported by the data.
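Linkage between two genes is commonly assessed by comparing the numbers of parental and recombinant progeny from a cross. The sketch below shows the arithmetic only: the function names and the progeny counts are hypothetical, and it assumes that each phenotypic class has already been assigned to a parental or recombinant genotype, which may not be straightforward if two genotypes fall in the same Compound C class.

```python
def recombination_frequency(parental, recombinant):
    """Percentage of progeny that are recombinant."""
    return 100.0 * recombinant / (parental + recombinant)

def chi_square_unlinked(parental, recombinant):
    """Chi-square statistic for the null hypothesis of independent
    assortment, under which parental and recombinant progeny are
    expected in a 1:1 ratio (one degree of freedom)."""
    expected = (parental + recombinant) / 2
    return ((parental - expected) ** 2 + (recombinant - expected) ** 2) / expected

# Critical value of chi-square for p = 0.05 with one degree of freedom.
CRITICAL_5PCT_DF1 = 3.841

# Hypothetical counts for two crosses (not the data in Figure 3.2).
for name, parental, recombinant in [("cross X", 170, 30), ("cross Y", 98, 102)]:
    rf = recombination_frequency(parental, recombinant)
    chi2 = chi_square_unlinked(parental, recombinant)
    verdict = "linked" if chi2 > CRITICAL_5PCT_DF1 else "no evidence of linkage"
    print(f"{name}: RF = {rf:.1f}%, chi-square = {chi2:.2f}, {verdict}")
```

A recombination frequency close to 50% is consistent with unlinked genes, whereas a value well below 50% suggests linkage, with the frequency giving an approximate map distance.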
Section 4: Investigating gene regulation controlling Compound C metabolism
The research lab is trying to find ways to decrease the amount of Compound C that is produced. They have previously determined that the activity of Regulator N is decreased in the presence of Compound B, resulting in increased expression of the enz1 gene. The activity of Regulator N is also controlled by other factors, so that it is activated even in the presence of Compound B. This involves the Regulator W protein, whose gene expression is also negatively regulated by RegN.
- In the absence of Compound B, Regulator N exists as a monomer and is active and able to block the expression of the enz1 and regW genes. The regN gene is constitutively expressed at a low level.
- In the presence of Compound B, Regulator N forms a stable homodimer and is inactive. This leads to the enz1 gene being expressed at a high level and the regW gene at a low level.
- As Regulator W increases in concentration, it destabilises RegN dimers, so RegW works in opposition to Compound B. This leads to some active RegN monomers and some inactive RegN dimers, resulting in a steady-state low level of enz1 and regW expression.
The researchers want to investigate whether this model is accurate and how they can create mutants that will lead to a reduced amount of Compound C. Two mutants have been created.
Mutant 1 contains a mutation in the 6 bp RegN DNA binding site located in the promoter of the enz1 gene.
Mutant 2 is a null mutant in the gene coding for the Regulator N protein (regN–).
Your task is to:
- Predict the frequency with which you would expect to isolate each of these mutations and explain your reasoning.
- Predict how each of the mutations would influence the expression of the genes required for the production of Compound C under different conditions and which, if any, of these mutations would reduce Compound C production.