Package org.snpeff.fileIterator
Class FastaFileIterator
- java.lang.Object
  - org.snpeff.fileIterator.FileIterator<java.lang.String>
    - org.snpeff.fileIterator.FastaFileIterator

All Implemented Interfaces:
java.lang.Iterable<java.lang.String>, java.util.Iterator<java.lang.String>
public class FastaFileIterator extends FileIterator<java.lang.String>
Opens a FASTA file and iterates over all FASTA sequences in the file.
Author: pcingola
-
Field Summary
Fields
- static char[] TRANSCRIPT_ID_SEPARATORS
- static java.lang.String TRANSCRIPT_ID_SEPARATORS_REGEX
-
Constructor Summary
Constructors
- FastaFileIterator(java.lang.String fastaFileName)
-
Method Summary
Methods
- java.util.List<java.lang.String> fastaHeader2Ids(): Try to parse IDs from a fasta header.
- java.lang.String getHeader(): Current sequence header.
- java.lang.String getName(): Sequence name (first 'word'). It extracts the characters after the leading '>' and before the first space, then removes leading 'chr', 'chr:', etc.
- java.lang.String getTranscriptId(): Get transcript name from FASTA header (ENSEMBL protein files). Format example: '>ENSP00000356130 pep:known chromosome:GRCh37:1:205111633:205180694:-1 gene:ENSG00000133059 transcript:ENST00000367162'
- protected java.lang.String readNext(): Read a sequence from the file.
Methods inherited from class org.snpeff.fileIterator.FileIterator
close, countNewLineChars, getFilePointer, getLine, getLineNum, guessNewLineChars, hasNext, hasSeek, init, isDebug, iterator, load, next, readLine, ready, remove, seek, setAutoClose, setDebug, setVerbose, toString
-
Method Detail
-
fastaHeader2Ids
public java.util.List<java.lang.String> fastaHeader2Ids()
Try to parse IDs from a fasta header
-
getHeader
public java.lang.String getHeader()
Current sequence header
-
getName
public java.lang.String getName()
Sequence name (first 'word'). Extracts the characters after the leading '>' and before the first space, then removes a leading 'chr', 'chr:', etc.
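The rule described above can be sketched as a small standalone helper. This is an illustrative sketch only, not the snpEff implementation: the class FastaNameSketch and the method nameFromHeader are hypothetical names, and the sketch assumes that only the lowercase prefixes 'chr' and 'chr:' need to be removed, since the remaining ('etc.') cases are not listed here.

```java
class FastaNameSketch {

    /**
     * Hypothetical helper (not part of snpEff): derives a sequence name
     * from a FASTA header line such as ">chr1 some description".
     */
    static String nameFromHeader(String header) {
        // Drop the leading '>' if it is present
        String s = header.startsWith(">") ? header.substring(1) : header;

        // Keep only the first 'word': everything before the first space
        int space = s.indexOf(' ');
        String word = (space < 0) ? s : s.substring(0, space);

        // Assumption: only these two prefixes are stripped in this sketch.
        // "chr:" is checked first because it also starts with "chr".
        if (word.startsWith("chr:")) return word.substring(4);
        if (word.startsWith("chr")) return word.substring(3);
        return word;
    }
}
```

Under these assumptions, a header that starts with '>chr1' followed by a description yields the name '1', while a header whose first word has no 'chr' prefix is returned unchanged.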
-
getTranscriptId
public java.lang.String getTranscriptId()
Get the transcript name from a FASTA header (ENSEMBL protein files). Format example: '>ENSP00000356130 pep:known chromosome:GRCh37:1:205111633:205180694:-1 gene:ENSG00000133059 transcript:ENST00000367162'
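One way to pull the transcript field out of a header in the format shown above is a regular expression on the 'transcript:' key. The following is a minimal sketch under that assumption, not the snpEff implementation: TranscriptIdSketch and transcriptIdFromHeader are hypothetical names, and returning null for a header without a 'transcript:' field is a choice made for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TranscriptIdSketch {

    // Matches a whitespace-delimited "transcript:<id>" field and captures <id>
    private static final Pattern TRANSCRIPT_FIELD =
            Pattern.compile("(?:^|\\s)transcript:(\\S+)");

    /**
     * Hypothetical helper (not part of snpEff): returns the value of the
     * "transcript:" field in an ENSEMBL-style header, or null if the
     * header has no such field (an assumption of this sketch).
     */
    static String transcriptIdFromHeader(String header) {
        Matcher m = TRANSCRIPT_FIELD.matcher(header);
        return m.find() ? m.group(1) : null;
    }
}
```

Applied to the format example above, this sketch returns 'ENST00000367162'.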
-
readNext
protected java.lang.String readNext()
Read a sequence from the file.
Specified by: readNext in class FileIterator<java.lang.String>
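As a rough illustration of what reading one sequence at a time from a FASTA file involves, the sketch below collects all lines between one '>' header and the next into a single string. It is not the snpEff code: FastaReadSketch is a hypothetical class, and the choices to keep the leading '>' in the stored header, to trim each sequence line, and to return null at end of input are assumptions made for this example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;

class FastaReadSketch {

    private final BufferedReader reader;
    private String nextHeader; // header line already consumed for the next record
    private String header;     // header of the record returned most recently

    FastaReadSketch(Reader in) {
        this.reader = new BufferedReader(in);
    }

    /** Header (including '>') of the sequence returned by the last readNext() call. */
    String getHeader() {
        return header;
    }

    /** Returns the next sequence as one string, or null when no record is left. */
    String readNext() {
        try {
            String line;
            // Skip anything before the first header line
            while (nextHeader == null && (line = reader.readLine()) != null) {
                if (line.startsWith(">")) nextHeader = line;
            }
            if (nextHeader == null) return null; // end of input

            header = nextHeader;
            nextHeader = null;

            // Concatenate sequence lines until the next header or end of input
            StringBuilder sequence = new StringBuilder();
            while ((line = reader.readLine()) != null) {
                if (line.startsWith(">")) {
                    nextHeader = line; // remember it for the following call
                    break;
                }
                sequence.append(line.trim());
            }
            return sequence.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Because a header line has to be read before the reader knows the current record is finished, the sketch keeps that line in nextHeader and reuses it on the following call.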