using sdsl_index_type = sdsl_index_type_ | The type of the underlying SDSL index. |
using sdsl_char_type = typename sdsl_index_type::alphabet_type::char_type | The character type of the reduced alphabet. (The reduced alphabet might be smaller than the original alphabet if not all possible characters occur in the indexed text.) |
using sdsl_sigma_type = typename sdsl_index_type::alphabet_type::sigma_type | The type of the alphabet size of the underlying SDSL index. |
using alphabet_type = alphabet_t | The type of the underlying character of the indexed text. |
using size_type = typename sdsl_index_type::size_type | Type for representing positions in the indexed text. |
using cursor_type = fm_index_cursor< fm_index > | The type of the (unidirectional) cursor. |
template<semialphabet alphabet_t, text_layout text_layout_mode_, detail::sdsl_index sdsl_index_type_ = default_sdsl_index_type>
class seqan3::fm_index< alphabet_t, text_layout_mode_, sdsl_index_type_ >
The SeqAn FM Index.
- Template Parameters
-
alphabet_t | The alphabet type; must model seqan3::semialphabet. |
text_layout_mode_ | Indicates whether this index works on a text collection or a single text. See seqan3::text_layout. |
sdsl_index_type_ | The type of the underlying SDSL index; must model seqan3::detail::sdsl_index. |
The seqan3::fm_index is a fast and space-efficient string index to search strings and collections of strings.
General information
Here is a short example of how to build an index and search for a pattern using a cursor. Note that the search module also offers a powerful high-level interface, seqan3::search, that encapsulates the use of cursors.
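#include <seqan3/alphabet/nucleotide/dna4.hpp>
#include <seqan3/core/debug_stream.hpp>
#include <seqan3/search/fm_index/all.hpp>

int main()
{
    using seqan3::operator""_dna4;
    // Example text (an assumption; any dna4 sequence works here).
    seqan3::dna4_vector genome = "ATCGATCGAAGGCTAGCTAGCTAAGGGA"_dna4;
    seqan3::fm_index index{genome};    // build the index
    auto cur = index.cursor();         // create a cursor
    cur.extend_right("AAGG"_dna4);     // search for the pattern "AAGG"
    for (auto && pos : cur.locate())   // print every hit
        seqan3::debug_stream << pos << ' ';
    seqan3::debug_stream << '\n';
    return 0;
}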
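To cross-check the positions a cursor reports for a pattern, a brute-force scan can serve as a reference. The following is a minimal sketch that uses only the standard library and plain std::string instead of seqan3::dna4; the helper name naive_locate is hypothetical and not part of SeqAn.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical reference helper (not a SeqAn function): brute-force scan that
// returns every start position of `pattern` in `text`, in ascending order.
inline std::vector<std::size_t> naive_locate(std::string const & text, std::string const & pattern)
{
    std::vector<std::size_t> hits;
    if (pattern.empty() || pattern.size() > text.size())
        return hits; // nothing to report for empty or overlong patterns

    // Restarting at pos + 1 (rather than pos + pattern.size()) keeps overlapping hits.
    for (std::size_t pos = text.find(pattern); pos != std::string::npos; pos = text.find(pattern, pos + 1))
        hits.push_back(pos);

    return hits;
}
```

The scan takes time proportional to the text length for every query, which is exactly the cost an index avoids. Hits obtained from an index may not be ordered by position, so sort both result lists before comparing them.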
- Attention
- When building an index for a single text over any alphabet, the symbol with rank 255 is reserved and may not occur in the text.
Here is an example using a collection of strings (e.g. a genome with multiple chromosomes or a protein database):
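#include <vector>
#include <seqan3/alphabet/nucleotide/dna4.hpp>
#include <seqan3/core/debug_stream.hpp>
#include <seqan3/search/fm_index/all.hpp>

int main()
{
    using seqan3::operator""_dna4;

    // Text collection; each entry is one sequence (variable name chosen for illustration).
    std::vector<seqan3::dna4_vector> genomes{"TAGCTGAAGCCATTGGCATCTGATCGGACT"_dna4,
                                             "ACTGAGCTCGTC"_dna4,
                                             "TGCATGCACCCATCGACTGACTG"_dna4,
                                             "GTACGTACGTTACG"_dna4};
    seqan3::fm_index index{genomes};   // build the index over the collection

    auto cur = index.cursor();         // create a cursor
    cur.extend_right("CTGA"_dna4);     // search for the pattern "CTGA"
    for (auto && pos : cur.locate())   // each hit is a (text number, position) pair
        seqan3::debug_stream << pos << ' ';
    seqan3::debug_stream << '\n';
    return 0;
}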
- Attention
- When building an index for a text collection over any alphabet, the symbols with rank 254 and 255 are reserved and may not be used in the text.
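The two reserved-rank rules can be checked before an index is built. The sketch below assumes the text has already been converted to symbol ranks stored as std::uint8_t; the helper ranks_are_indexable is hypothetical and not part of SeqAn. Small alphabets such as seqan3::dna4 (four symbols) can never hit the reserved ranks, so the check only matters for alphabets that use nearly the full byte range.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper, not part of SeqAn: returns true if no rank in `ranks`
// collides with the ranks reserved by the index.
// Single text: rank 255 is reserved. Text collection: ranks 254 and 255.
inline bool ranks_are_indexable(std::vector<std::uint8_t> const & ranks, bool is_collection)
{
    std::uint8_t const first_reserved = is_collection ? 254 : 255;
    return std::none_of(ranks.begin(), ranks.end(),
                        [first_reserved](std::uint8_t r) { return r >= first_reserved; });
}

// Convenience overload for a whole text collection: every text must pass the
// stricter collection rule.
inline bool ranks_are_indexable(std::vector<std::vector<std::uint8_t>> const & texts)
{
    return std::all_of(texts.begin(), texts.end(),
                       [](auto const & text) { return ranks_are_indexable(text, true); });
}
```

A text that contains rank 254 passes the single-text check but fails the collection check, which mirrors the difference between the two notes above.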
Choosing an index implementation
The underlying implementation of the FM Index (rank data structure, sampling rates, etc.) can be specified by passing a different SDSL index type as the third template parameter (sdsl_index_type_), after the alphabet type and the text layout.
- Todo:
- Link to SDSL documentation or write our own once SDSL3 documentation is available somewhere....