This module handles the core definition of phylogenetic character data.
A container that holds the value of a particular cell in a matrix.
An annotatable dictionary with Taxon objects as keys and CharacterDataVector objects as values.
Extends this matrix by adding taxa and characters from the given matrix to this one. If overwrite_existing is True and a taxon in the other matrix is already present in the current one, then the sequence associated with the taxon in the second matrix replaces the sequence in the current one. If extend_existing is True and a taxon in the other matrix is already present in the current one, then the sequence associated with the taxon in the second matrix will be added to the sequence in the current one. If both are True, then an exception is raised. If neither is True, and a taxon in the other matrix is already present in the current one, then the sequence is ignored. Note that the taxa of the containing CharacterMatrix have to be normalized after this operation.
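The overwrite/extend rules described above can be sketched with plain dictionaries. This is an illustrative stand-in, not the library's implementation: extend_map is a hypothetical helper, taxa are represented by string labels, sequences by lists, and the use of ValueError for the both-True case is an assumption (the text only says that an exception is raised).

```python
def extend_map(current, other, overwrite_existing=False, extend_existing=False):
    """Hypothetical helper: `current` and `other` map taxon labels to lists."""
    if overwrite_existing and extend_existing:
        # The exception type is an assumption; the text only says one is raised.
        raise ValueError("overwrite_existing and extend_existing cannot both be True")
    for taxon, seq in other.items():
        if taxon not in current:
            current[taxon] = list(seq)    # new taxon: always added
        elif overwrite_existing:
            current[taxon] = list(seq)    # existing taxon: sequence replaced
        elif extend_existing:
            current[taxon].extend(seq)    # existing taxon: sequence appended to
        # neither flag set: the incoming sequence is ignored
    return current


def fresh():
    """Build a new copy of the starting data for each scenario."""
    return {"A": list("AC"), "B": list("GT")}


other = {"B": list("TT"), "C": list("AA")}
ignored = extend_map(fresh(), other)
overwritten = extend_map(fresh(), other, overwrite_existing=True)
extended = extend_map(fresh(), other, extend_existing=True)
```

In all three scenarios the new taxon "C" is added; only the treatment of the shared taxon "B" differs.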
Extends this matrix by adding characters from sequences of taxa in given matrix to sequences of taxa with corresponding labels in this one. Taxa in the second matrix that do not exist in the current one are ignored.
Returns number of characters in first sequence.
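By contrast with the taxon-adding variant, the character-only extension never introduces new taxa. A minimal sketch, assuming taxa are matched by string label and sequences are lists (extend_characters here is a hypothetical stand-alone function, not the library method):

```python
def extend_characters(current, other):
    """Append characters from `other` only for labels already in `current`."""
    for label, seq in other.items():
        if label in current:
            current[label].extend(seq)
        # labels absent from `current` are skipped
    return current


current = {"A": list("AC"), "B": list("GT")}
extend_characters(current, {"B": list("TT"), "C": list("AA")})
```

Note that after such an operation the sequences may no longer be of equal length, as happens to "A" and "B" here.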
A list of character data values for a taxon – a row of a Character Matrix.
The CharacterDataVector typically contains elements that are instances of CharacterDataCell.
Sets the value of the cell at a particular position.
Character data container/manager.
__init__ calls TaxonSetLinked.__init__ for handling of oid, label and taxon_set keyword arguments.
Can be initialized with:
- source keyword arguments (see Readable.process_source_kwargs), or
- a single unnamed CharacterMatrix instance (which will be deep-copied).
Adds a CharacterSubset object. Raises an error if one already exists with the same label.
Deletes all items from the character map dictionary.
TODO: may need to check that we are not overwriting oid
Creates and returns a single character matrix from multiple CharacterMatrix objects specified as a list, ‘char_matrices’. All the CharacterMatrix objects in the list must be of the same type, and share the same TaxonSet reference. All taxa must be present in all alignments, and all alignments must be of the same length. Component parts will be recorded as character subsets.
Read a character matrix from each file path given in paths, assuming data format/schema schema, and passing any keyword arguments down to the underlying specialized reader. Merge the character matrices and return the combined character matrix. Component parts will be recorded as character subsets.
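The concatenation described above, including the recording of each component as a character subset, might look like the following sketch. It works on plain dictionaries mapping taxon labels to strings; the function name, the "locusNNN" subset labels, the ValueError exceptions, and the reading of the length requirement as "sequences within each alignment have equal length" are all assumptions made for illustration.

```python
def concatenate(char_matrices):
    """Hypothetical helper: each matrix maps a taxon label to a sequence string."""
    taxa = set(char_matrices[0])
    combined = {taxon: "" for taxon in char_matrices[0]}
    subsets = {}
    offset = 0
    for i, matrix in enumerate(char_matrices):
        if set(matrix) != taxa:
            raise ValueError("all taxa must be present in all alignments")
        lengths = {len(seq) for seq in matrix.values()}
        if len(lengths) != 1:
            raise ValueError("sequences in an alignment must be of equal length")
        length = lengths.pop()
        for taxon, seq in matrix.items():
            combined[taxon] += seq
        # Record the 0-based column range contributed by this component.
        subsets["locus%03d" % (i + 1)] = list(range(offset, offset + length))
        offset += length
    return combined, subsets


m1 = {"A": "AC", "B": "GT"}
m2 = {"A": "GGG", "B": "TTT"}
combined, subsets = concatenate([m1, m2])
```

The subset index lists make it possible to recover each original alignment from the combined matrix later.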
Read a character matrix from each file object given in streams, assuming data format/schema schema, and passing any keyword arguments down to the underlying specialized reader. Merge the character matrices and return the combined character matrix. Component parts will be recorded as character subsets.
Returns a dictionary that maps taxon objects to lists of sets of state indices. If char_indices is not None, it should be an iterable collection of character indices to include.
Returns description of object, up to level depth.
Returns a new CharacterMatrix (of the same type) consisting only of columns given by the 0-based indices in indices. Note that this new matrix will still reference the same taxon set.
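A column export of this kind can be sketched as follows. The Taxon class and export_columns function are hypothetical stand-ins; returning the columns in ascending index order with duplicates dropped is an assumption, since the text does not say how repeated or unordered indices are handled.

```python
class Taxon:
    """Minimal stand-in for a taxon object (hypothetical)."""
    def __init__(self, label):
        self.label = label


def export_columns(matrix, indices):
    """Return a new mapping holding only the columns at the given 0-based indices."""
    keep = sorted(set(indices))
    # Keys are reused, not copied, so both matrices refer to the same taxa.
    return {taxon: [seq[i] for i in keep] for taxon, seq in matrix.items()}


t1, t2 = Taxon("A"), Taxon("B")
matrix = {t1: list("ACGT"), t2: list("TGCA")}
subset = export_columns(matrix, [0, 2])
```

Because the new mapping is keyed by the very same Taxon objects, a change to a taxon label is visible through both matrices, which mirrors the shared taxon set noted above.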
Returns a new CharacterMatrix (of the same type) consisting only of columns given by the CharacterSubset, character_subset. Note that this new matrix will still reference the same taxon set.
Extends this matrix by adding taxa and characters from the given matrix to this one. If overwrite_existing is True and a taxon in the other matrix is already present in the current one, then the sequence associated with the taxon in the second matrix replaces the sequence in the current one. If extend_existing is True and a taxon in the other matrix is already present in the current one, then the sequence associated with the taxon in the second matrix will be added to the sequence in the current one. If both are True, then an exception is raised. If neither is True, and a taxon in the other matrix is already present in the current one, then the sequence is ignored.
Extends this matrix by adding characters from sequences of taxa in given matrix to sequences of taxa with corresponding labels in this one. Taxa in the second matrix that do not exist in the current one are ignored.
Extends this matrix by adding taxa and characters from the given map to this one. If overwrite_existing is True and a taxon in the other map is already present in the current one, then the sequence associated with the taxon in the second map replaces the sequence in the current one. If extend_existing is True and a taxon in the other map is already present in the current one, then the sequence associated with the taxon in the second map will be added to the sequence in the current one. If both are True, then an exception is raised. If neither is True, and a taxon in the other map is already present in the current one, then the sequence is ignored.
Gets an item from character map by its key, returning default if key not present.
Returns true if character map has key, regardless of case.
Returns dictionary of element id to corresponding character definition.
Returns character map key, value pairs in key-order.
Returns an iterator over character map’s values.
Dictionary interface implementation for direct access to character map.
Dictionary interface implementation for direct access to character map.
Returns a copy of the ordered list of character map keys.
Defines a set of characters (columns) that make up a character set. Raises an error if one already exists with the same label. Column indices are 0-based.
a.popitem(): removes and returns the last (key, value) pair.
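As a rough illustration of the duplicate-label rule, the sketch below stores subsets in a plain dictionary keyed by label. The function shown is a hypothetical stand-alone version, and KeyError is an assumed exception type (the text only says that an error is raised).

```python
def new_character_subset(character_subsets, label, character_indices):
    """Register a labelled list of 0-based column indices."""
    if label in character_subsets:
        # Exception type is an assumption, not taken from the library.
        raise KeyError("character subset %r already exists" % label)
    character_subsets[label] = list(character_indices)
    return character_subsets[label]


subsets = {}
# First codon positions of a 9-column alignment: columns 0, 3 and 6.
new_character_subset(subsets, "codon1", range(0, 9, 3))
```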
Removes given taxa from matrix. If preserve_taxon_set is True, then the taxa are removed from the associated TaxonSet object as well. Otherwise the TaxonSet is not modified (default).
Populates objects of this type from schema-formatted data in the file-like object source stream, replacing all current data. If multiple character matrices are in the data source, a 0-based index of the character matrix to use can be specified using the matrix_offset keyword (defaults to 0, i.e., first character matrix).
Synchronizes Taxon objects of map to taxon_set of self.
Sets the default value to return if key not present.
Updates local taxa block by adding taxa not already managed. Mainly for use after map extension.
Returns list of values.
Returns number of characters in first sequence.
Returns list of vectors.
Writes out this object’s data to stream, a file-like object opened for writing.
Tracks definition of a subset of characters.
Keyword arguments:
- label: name of this subset.
- character_indices: list of 0-based (integer) indices of column positions that constitute this subset.
A character format or type of a particular column: i.e., maps a particular set of character state definitions to a column in a character matrix.
Character data container/manager.
See CharacterMatrix.__init__ documentation
Character data container/manager that adds the attributes self.state_alphabets (a list of alphabets) and self.default_state_alphabet.
See CharacterMatrix.__init__ documentation for kwargs.
Unnamed args are passed to clone_from.
DNA nucleotide data.
See CharacterMatrix.__init__ documentation for kwargs.
Unnamed args are passed to clone_from.
Infinite sites data.
See CharacterMatrix.__init__ documentation for kwargs.
Unnamed args are passed to clone_from.
Generic nucleotide data.
Initializes the object. Handles keyword arguments: oid, label and taxon_set.
Protein / amino acid data.
Initializes the object. Handles keyword arguments: oid, label and taxon_set.
Restriction sites data.
See CharacterMatrix.__init__ documentation for kwargs.
Unnamed args are passed to clone_from.
RNA nucleotide data.
See CharacterMatrix.__init__ documentation for kwargs.
Unnamed args are passed to clone_from.
Tracks distinct site patterns in a character matrix. Useful for efficient computations.
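To illustrate why tracking site patterns helps: identical columns contribute identically to many per-site computations, so each distinct pattern can be evaluated once and weighted by its count. The sketch below is an assumption-laden stand-in (a plain dict of aligned strings and a hypothetical site_patterns function), not the class documented here.

```python
from collections import Counter


def site_patterns(matrix):
    """Count distinct columns of an alignment given as label -> sequence string."""
    taxa = sorted(matrix)                        # fix a taxon order for the columns
    columns = zip(*(matrix[t] for t in taxa))    # one tuple of states per site
    return Counter(columns)


matrix = {"A": "AAGA", "B": "CCGC", "C": "TTGT"}
patterns = site_patterns(matrix)
```

Here four sites collapse to two distinct patterns, so a per-site computation would run twice instead of four times.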
Standard data.
See CharacterMatrix.__init__ documentation for kwargs.
Unnamed args are passed to clone_from.
Extends this matrix by adding taxa and characters from the given matrix to this one. If overwrite_existing is True and a taxon in the other matrix is already present in the current one, then the sequence associated with the taxon in the second matrix replaces the sequence in the current one. If extend_existing is True and a taxon in the other matrix is already present in the current one, then the sequence associated with the taxon in the second matrix will be added to the sequence in the current one. If both are True, then an exception is raised. If neither is True, and a taxon in the other matrix is already present in the current one, then the sequence is ignored.
A list of states available for a particular character type/format.
Returns list of ambiguous states of this alphabet.
Returns list of fundamental states of this alphabet.
Returns state in self in which attr_name equals value.
Returns list of states with ids/symbols/tokens equal to values given in a list of ids/symbols/tokens (exact matches, one-to-one correspondence between state and attribute value in list).
Returns (plain) list of CharacterDataCell objects with values set to states corresponding to symbols given by symbols.
Returns CharacterDataVector object, with member CharacterDataCell objects with values set to states corresponding to symbols given by symbols. If taxon is given in keyword arguments, its value will be assigned to the taxon property of the CharacterDataVector.
Returns dictionary of element ids to state objects.
Returns True if the Alphabet has an element designated as the gap “state” and el is this element.
Returns SINGLE state that has ids/symbols/tokens as member states.
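One way to picture this lookup: given a collection of fundamental symbols, find the one state whose member set is exactly that collection. The sketch below assumes a simplified alphabet (a dict from symbol to its set of fundamental symbols, with each fundamental state mapping to itself) and returns None when nothing matches; both choices are assumptions, since the text does not describe the no-match behavior.

```python
def match_state(alphabet, symbols):
    """Return the single symbol whose member set equals `symbols`, else None."""
    target = frozenset(symbols)
    for symbol, members in alphabet.items():
        if frozenset(members) == target:
            return symbol
    return None  # assumption: no-match behavior is not specified in the text


# A fragment of the IUPAC nucleotide codes: R is a purine, Y a pyrimidine.
dna = {
    "A": {"A"}, "C": {"C"}, "G": {"G"}, "T": {"T"},
    "R": {"A", "G"}, "Y": {"C", "T"},
}
purine = match_state(dna, ["A", "G"])
```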
Returns list of multistate states of this alphabet.
Returns list of ambiguous states of this alphabet.
Returns a StateAlphabetElement object corresponding to given symbol.
Returns index of the StateAlphabetElement object corresponding to the given symbol.
Returns dictionary with symbols as keys and StateAlphabetElement objects as values.
A character state definition, which can either be a fundamental state or a mapping to a set of other character states (for polymorphic or ambiguous characters).
Returns set of ids of all fundamental states to which this state maps.
Returns value of self in terms of a set of fundamental states (i.e., set of single states) that correspond to this state.
Returns set of symbols of all fundamental states to which this state maps.
Returns set of tokens of all fundamental states to which this state maps.
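The mapping from a state to its fundamental states can be sketched recursively: a fundamental state maps to itself, while an ambiguous or polymorphic state maps to the union of whatever its members map to. The State class below is a hypothetical, stripped-down stand-in with only a symbol and a member list; it is not the class documented here.

```python
class State:
    """Hypothetical minimal state: a symbol plus optional member states."""
    def __init__(self, symbol, member_states=None):
        self.symbol = symbol
        self.member_states = member_states or []

    def fundamental_states(self):
        if not self.member_states:
            return {self}                  # a fundamental state maps to itself
        result = set()
        for member in self.member_states:
            result |= member.fundamental_states()   # members may be nested
        return result

    def fundamental_symbols(self):
        return {state.symbol for state in self.fundamental_states()}


A, C, G, T = (State(s) for s in "ACGT")
R = State("R", [A, G])          # IUPAC code for A or G
N = State("N", [A, C, G, T])    # IUPAC code for any nucleotide
nested = State("X", [R, C])     # made-up state whose member is itself ambiguous
```

Expressing every state as a set of fundamental states lets ambiguous and unambiguous cells be compared with ordinary set operations.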