Checking whether a document is a nucleotide or amino acid sequence

I'm writing a plugin that can take either nucleotide or amino acid sequences as input. If a nucleotide sequence is input, it gets translated as the first step.

The plugin only acts on one document at a time.

What is the most robust way to test whether the input document is a nucleotide or amino acid sequence?

My current solution is:

if (documents[0] instanceof NucleotideSequenceDocument) {
    // Do nucleotide stuff
}
else if (documents[0] instanceof AminoAcidSequenceDocument) {
    // Do amino acid stuff
}

That works most of the time, but breaks on certain documents. Specifically, it breaks on a document which the debugger tells me is of type:

com.biomatters.plugins.ncbi.documents.GenBankNucleotideSequence

There must be a way to fix this, because the signature selection function is able to figure it out and works just fine. It would be nice to know what it's doing behind the scenes so I can do the same thing.

public DocumentSelectionSignature[] getSelectionSignatures() {
    // This indicates this annotation generator will accept a single nucleotide sequence or a single amino acid sequence as input
    return new DocumentSelectionSignature[] {
        new DocumentSelectionSignature(AminoAcidSequenceDocument.class, 1, 1),
        new DocumentSelectionSignature(NucleotideSequenceDocument.class, 1, 1)
    };
}

Sean Johnson

January 03, 2019 16:27

I figured out a solution that seems to work. There may be better ways to do this, but for now I'm wrapping the cast operations in try-catch blocks.

For example:

try {
    inputAASequence = (AminoAcidSequenceDocument) documents[0].getDocument();
    doctype = "amino acid";
}
catch (Exception e) {
    // Not an amino acid sequence, so try a nucleotide sequence instead.
    try {
        inputNtSequence = (NucleotideSequenceDocument) documents[0].getDocument();
        doctype = "nucleotide";
    }
    catch (Exception e2) {
        throw new DocumentOperationException("Cannot determine the type of the input document");
    }
}

January 03, 2019 18:32

I found another way to do this. I presume that this way is actually the preferred way.

if (DocumentType.isAminoAcidSequence(documents[0].getDocument())) {
    // Do amino acid stuff
}
else if (DocumentType.isNucleotideSequence(documents[0].getDocument())) {
    // Do nucleotide stuff
}
else {
    throw new DocumentOperationException("Input document must either be a nucleotide sequence or an amino acid sequence.");
}

Richard Moir

January 04, 2019 01:47

I can confirm you found the right way to do it :)
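
For anyone reading later: one visible difference between the original instanceof check and the two versions that work is that the working ones inspect documents[0].getDocument() rather than documents[0] itself. The sketch below illustrates that wrapper-versus-document distinction in plain Java. It is only an illustration, not the Geneious implementation: WrappedDocument, NucleotideDoc, AminoAcidDoc, GenBankLikeNucleotide, PlainProtein and SequenceKindDemo are hypothetical stand-ins invented for this example, and whether this is what actually broke the original check is an assumption.

```java
// Hypothetical stand-in types for illustration only; NOT the Geneious API.
interface NucleotideDoc { String getSequenceString(); }
interface AminoAcidDoc { String getSequenceString(); }

// Plays the role of the wrapper handed to the plugin: it holds the real document.
final class WrappedDocument {
    private final Object inner;
    WrappedDocument(Object inner) { this.inner = inner; }
    Object getDocument() { return inner; }
}

// A concrete nucleotide class, standing in for a source-specific document type.
final class GenBankLikeNucleotide implements NucleotideDoc {
    private final String bases;
    GenBankLikeNucleotide(String bases) { this.bases = bases; }
    public String getSequenceString() { return bases; }
}

// A concrete amino acid class.
final class PlainProtein implements AminoAcidDoc {
    private final String residues;
    PlainProtein(String residues) { this.residues = residues; }
    public String getSequenceString() { return residues; }
}

class SequenceKindDemo {
    enum Kind { NUCLEOTIDE, AMINO_ACID }

    // Unwrap once, then test the inner document rather than the wrapper.
    static Kind classify(WrappedDocument wrapper) {
        Object doc = wrapper.getDocument();
        if (doc instanceof AminoAcidDoc) {
            return Kind.AMINO_ACID;
        }
        if (doc instanceof NucleotideDoc) {
            return Kind.NUCLEOTIDE;
        }
        throw new IllegalArgumentException(
                "Input document must be either a nucleotide sequence or an amino acid sequence.");
    }

    public static void main(String[] args) {
        WrappedDocument wrapper = new WrappedDocument(new GenBankLikeNucleotide("ATGGCC"));
        // The wrapper itself implements neither interface, so testing it directly fails.
        System.out.println(((Object) wrapper) instanceof NucleotideDoc); // prints false
        // The unwrapped document does implement the interface.
        System.out.println(classify(wrapper)); // prints NUCLEOTIDE
    }
}
```

In a real plugin, the DocumentType checks shown earlier in the thread would take the place of classify.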
