Wolfram Language & System Documentation Center

BioSequenceModify

See Also
- BioSequence
- BioSequenceTranslate
- BioSequenceBackTranslateList
- Entity Types
- GeneticTranslationTable
Related Guides
- Biomolecular Sequences

BioSequenceModify[seq,"mod"]

gives the result of applying the modification "mod" to the sequence seq.

BioSequenceModify[seq,{"mod",params}]

specifies the parameters params for "mod".

BioSequenceModify[modspec]

represents an operator form of BioSequenceModify that can be applied to a biomolecular sequence.

Details

Bond modifications:

	{"AddBond",{i₁,i₂}}	add a higher-order bond between letters at i₁ and i₂
	{"AddBond",Bond[{i₁,i₂},"type"]}	add a bond of the given type between the given indices
	{"DeleteBond",{i₁,i₂}}	remove all higher-order bonds between the given indices
	{"DeleteBond",Bond[{i₁,i₂},"type"]}	remove the specified bond between the given indices

Circularity adjustment modifications:

	"MakeCircular"	convert a linear sequence into a circular sequence
	"MakeLinear"	convert a circular sequence into a linear sequence
	{"MakeLinear",i}	convert to a linear sequence, starting at position i

Collection modifications:

	{"AddToCollection",{seq₁,seq₂,…}}	incorporate a list of sequences into a sequence collection
	"SplitDisconnectedCollection"	separate unbonded clusters into separate collections

Representation-only modifications:

	"InnermostBondRepresentation"	represent bonds at the innermost applicable sequence
	"OutermostBondRepresentation"	represent bonds at the outermost sequence
	"CanonicalRepresentation"	convert all sequences and bonds to a canonical form

Translation modifications:

	"DropIncompleteCodons"	drop incomplete codons from the end of DNA or RNA
	"DropToStartCodon"	drop letters from DNA or RNA until a start codon is found
	"DropFromStopLetter"	drop the letters from a peptide after a stop letter is found

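The translation modifications are often applied one after another. A minimal sketch, assuming the operator form described above and BioSequenceTranslate with its default translation table; the order of the steps is an illustrative choice rather than something this page prescribes, and no particular output is implied:

Wolfram Language code:

(* Sketch only: trim a DNA sequence, translate it, then cut the peptide at a stop letter. *)
(* The step order is an assumption made for illustration. *)
dna = BioSequence["DNA", "ACTGATATAC"];

dna //
	BioSequenceModify["DropToStartCodon"] //
	BioSequenceModify["DropIncompleteCodons"] //
	BioSequenceTranslate //
	BioSequenceModify["DropFromStopLetter"]
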
Examples

Basic Examples (5)

Add a bond to a sequence:

Wolfram Language code:

BioSequenceModify[
	BioSequence["RNA", "AGGGU"], 
	{"AddBond", Bond[{1, 5}, "MultiHydrogen"]}
	]

Delete a bond from a sequence:

Wolfram Language code:

BioSequenceModify[
	BioSequence["RNA", "AGGGU", {Bond[{1, 5}, "MultiHydrogen"]}], 
	{"DeleteBond", Bond[{1, 5}, "MultiHydrogen"]}
	]

Represent all bonds at the innermost sequence, with all letters included:

Wolfram Language code:

BioSequenceModify[
	BioSequence["HybridStrand", 
	{BioSequence["DNA", "CAGT"], BioSequence["RNA", "GUA"]}, 
	{Bond[{{1, 2}, {1, 4}}, "MultiHydrogen"]}
	], 
	"InnermostBondRepresentation"
	]//InputForm

Represent all bonds at the outermost sequence:

Wolfram Language code:

BioSequenceModify[
	BioSequence["HybridStrand", {BioSequence["DNA", "CAGT", {Bond[{2, 4}, "MultiHydrogen"]}], 
  BioSequence["RNA", "GUA", {}]}, {}], 
	"OutermostBondRepresentation"
	]//InputForm

Canonicalize the representation of bonds and sequences into a sorted and reduced form:

Wolfram Language code:

BioSequenceModify[
	BioSequence[{BioSequence["DNA", "GGGG"], BioSequence["DNA", "CCCC"]}, 
	{Bond[{{1, 1}, {2, 2}}, "MultiHydrogen"]}
	], 
	"CanonicalRepresentation"
	]//InputForm

Scope (30)

Convert a linear sequence to a circular sequence:

Wolfram Language code:

BioSequenceModify[
	BioSequence["DNA", "TGGACTTTC", {}], 
	"MakeCircular"
	]

Convert a circular sequence to a linear sequence:

Wolfram Language code:

BioSequenceModify[
	BioSequence["CircularDNA", "TGGACTTTC", {}], 
	"MakeLinear"
	]

Add a list of sequences into a sequence collection:

Wolfram Language code:

BioSequenceModify[
	BioSequence[
	{BioSequence["RNA", "AUU"], 
	BioSequence["DNA", "GGC"]}
	], 
	{"AddToCollection", {BioSequence["RNA", "GCA"], BioSequence["DNA", "TTC"]}}
	]

Separate the unbound components of a sequence collection into separate collections:

Wolfram Language code:

BioSequenceModify[
	BioSequence[
	{BioSequence["HybridStrand", {"CAGT", "GUA"}], 
	BioSequence["DNA", "GGC"], 
	BioSequence["HybridStrand", {"CAU", "ATTCG"}]}, 
	Bond[{{1, 2, 1}, {3, 1, 1}}, "MultiHydrogen"]
	], 
	"SplitDisconnectedCollection"
	]

Drop letters at the end of a nucleotide sequence so only complete codons are present for translation:

Wolfram Language code: BioSequenceModify[BioSequence["DNA", "ACTGATATAC"], "DropIncompleteCodons"]
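
The idea behind this trimming can be sketched on a plain string: keep the longest prefix whose length is a multiple of three. This is only an illustration of the arithmetic, not necessarily how BioSequenceModify is implemented:

Wolfram Language code:

(* Plain-string illustration; s is the raw letter string, not a BioSequence. *)
s = "ACTGATATAC";
(* 10 letters hold 3 complete codons, so the first 9 letters are kept. *)
StringTake[s, 3 Quotient[StringLength[s], 3]]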

Drop the letters up to a start codon in the default genetic translation table:

Wolfram Language code: BioSequenceModify[BioSequence["DNA", "ACTGATATAC"], "DropToStartCodon"]

Drop the letters after the stop letter in a peptide sequence:

Wolfram Language code: BioSequenceModify[BioSequence["Peptide", "MGLSDGEWQ.LVLNVWG"], "DropFromStopLetter"]

"AddBond" (4)

A bond type does not need to be specified to insert a bond. If one is not given, it will be inferred:

Wolfram Language code:

BioSequenceModify[
	BioSequence["RNA", "GAGGUGG"], 
	{"AddBond", {2, 5}}
	]//InputForm

Wolfram Language code: BioSequencePlot[%]

The inferred bond type may depend on the letters being linked:

Wolfram Language code:

BioSequenceModify[
	BioSequence["Peptide", "CGGGU"], 
	{"AddBond", {1, 5}}
	]//InputForm

Wolfram Language code:

BioSequenceModify[
	BioSequence["Peptide", "DGGGK"], 
	{"AddBond", {1, 5}}]//InputForm

Bonds can be added to hybrid strands:

Wolfram Language code:

BioSequenceModify[
	BioSequence["HybridStrand", {BioSequence["RNA", "GAGG"], BioSequence["DNA", "GTGG"]}], 
	{"AddBond", {{1, 2}, {2, 2}}}
	]

Wolfram Language code: BioSequencePlot[%]

Bonds can be added to sequence collections:

Wolfram Language code:

BioSequenceModify[
	BioSequence[{BioSequence["RNA", "GAGG"], BioSequence["DNA", "GTGG"]}], 
	{"AddBond", {{1, 2}, {2, 2}}}
	]

Wolfram Language code: BioSequencePlot[%]

"AddToCollection" (3)

A single sequence can also be added to a collection:

Wolfram Language code:

BioSequenceModify[
	BioSequence[
	{BioSequence["RNA", "AUU"], 
	BioSequence["DNA", "GGC"]}
	], 
	{"AddToCollection", BioSequence["RNA", "GCA"]}
	]

A single motif or hybrid sequence will be converted into a collection:

Wolfram Language code:

BioSequenceModify[
	BioSequence["RNA", "AUU"], 
	{"AddToCollection", BioSequence["RNA", "GCA"]}
	]

If there are multiple sequence collections, they will be merged in the result:

Wolfram Language code:

BioSequenceModify[
	BioSequence[
	{BioSequence["RNA", "AUU"], 
	BioSequence["DNA", "GGC"]}
	], 
	{"AddToCollection", BioSequence[{BioSequence["RNA", "GCA"], BioSequence["DNA", "TTC"]}]}
	]

"CanonicalRepresentation" (3)

If sequences are identical, canonicalization will use strand-level bonds for ordering:

Wolfram Language code:

BioSequenceModify[
	BioSequence[
	{BioSequence["DNA", "NNNN", Bond[{2, 4}]], BioSequence["DNA", "NNNN", Bond[{1, 3}]]}
	], 
	"CanonicalRepresentation"
	]["SequenceBondList"]

If the sequences and strand-level bonds are identical, canonicalization will use sequence bonds for ordering:

Wolfram Language code:

BioSequenceModify[
	BioSequence[
	{BioSequence["DNA", "NNNNNN", Bond[{5, 6}, "MultiHydrogen"]], BioSequence["DNA", "NNNNNN", Bond[{5, 6}, "MultiHydrogen"]]}, 
	{Bond[{{1, 4}, {2, 3}}, "MultiHydrogen"]}
	], 
	"CanonicalRepresentation"
	]["SequenceBondList"]

In addition to sorting, single-strand collections are reduced to the strand and single motif hybrids are reduced to the motif:

Wolfram Language code:

BioSequenceModify[
	BioSequence[{BioSequence["HybridStrand", {BioSequence["DNA", "CAGT"]}]}, 
	{Bond[{{1, 1, 2}, {1, 1, 4}}, "MultiHydrogen"]}
	], 
	"CanonicalRepresentation"
	]

"DeleteBond" (2)

Delete all higher-order bonds between the two indices:

Wolfram Language code:

BioSequenceModify[
	BioSequence["RNA", "AGGGU", {Bond[{1, 5}, "MultiHydrogen"]}], 
	{"DeleteBond", {1, 5}}
	]

Deleting bonds always works on the outermost form, which is the form given by the "SequenceBondList" property:

Wolfram Language code:

nestedExample = BioSequence[{BioSequence["HybridStrand", 
	{BioSequence["DNA", "CAGT", Bond[{2, 4}, "MultiHydrogen"]], 
	BioSequence["RNA", "GUA"]
	}
	]}]

Wolfram Language code: nestedExample["SequenceBondList"]

Wolfram Language code:

BioSequenceModify[
	BioSequence[nestedExample], 
	{"DeleteBond", {{1, 1, 2}, {1, 1, 4}}}
	]

"DropToStartCodon" (3)

Any genetic translation table entity can be used to specify start codons:

Wolfram Language code:

BioSequenceModify[BioSequence["DNA", "ACTGATATAC"], {"DropToStartCodon", Entity["GeneticTranslationTable", "VertebrateMitochondrial"]}]

A specific codon or list of codons can be used as the start codon specification:

Wolfram Language code: BioSequenceModify[BioSequence["DNA", "ACTGATATAC"], {"DropToStartCodon", "ATA"}]

Wolfram Language code: BioSequenceModify[BioSequence["DNA", "ACTGATATAC"], {"DropToStartCodon", {"GAT", "ATA"}}]

Modifications can be created in an operator form:

Wolfram Language code: BioSequence["DNA", "ACTGATATAC"]//BioSequenceModify["DropToStartCodon"]

Modifications with further specifications can also be used in operator form:

Wolfram Language code:

BioSequence["DNA", "ACTGATATAC"]//BioSequenceModify[{"DropToStartCodon", Entity["GeneticTranslationTable", "VertebrateMitochondrial"]}]
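
Because the operator form is an ordinary function, it can also be stored in a symbol and mapped over a list of sequences. A brief sketch; the name trim and the second sequence are hypothetical, chosen only for illustration:

Wolfram Language code:

(* trim is a hypothetical helper name; "GGCATGA" is a made-up input. *)
trim = BioSequenceModify["DropIncompleteCodons"];
trim /@ {BioSequence["DNA", "ACTGATATAC"], BioSequence["DNA", "GGCATGA"]}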

"InnermostBondRepresentation" (1)

Moving bonds inward can potentially bring them into the motif from several layers:

Wolfram Language code:

BioSequenceModify[
	BioSequence[{BioSequence["HybridStrand", {BioSequence["DNA", "CAGT"], "U"}], BioSequence["HybridStrand", {BioSequence["RNA", "GUA"], "T"}]}, 
	{Bond[{{1, 1, 2}, {1, 1, 4}}, "MultiHydrogen"]}
	], 
	"InnermostBondRepresentation"
	]//InputForm

"MakeCircular" (2)

RNA sequences can be converted to circular RNA sequences:

Wolfram Language code:

BioSequenceModify[
	BioSequence["RNA", "UUGUAGUUA", {}], 
	"MakeCircular"]

Peptide sequences can be converted to circular peptide sequences:

Wolfram Language code:

BioSequenceModify[
	BioSequence["Peptide", "MESLVPGFNEKTHVQL", {}], 
	"MakeCircular"]

"MakeLinear" (3)

Circular RNA sequences can be converted to linear RNA sequences:

Wolfram Language code:

BioSequenceModify[
	BioSequence["CircularRNA", "UUGUAGUUA", {}], 
	"MakeLinear"]

Circular peptide sequences can be converted to linear peptide sequences:

Wolfram Language code:

BioSequenceModify[
	BioSequence["CircularPeptide", "MESLVPGFNEKTHVQL", {}], 
	"MakeLinear"]

Start the linear sequence from a specific position:

Wolfram Language code:

BioSequenceModify[
	BioSequence["CircularPeptide", "MESLVPGFNEKTHVQL", {}], 
	{"MakeLinear", 7}]

Relative bond positions are preserved when converting circular sequences to linear sequences:

Wolfram Language code:

BioSequenceModify[
	BioSequence["CircularPeptide", "MESLVPGFNEKTHVQL", {Bond[{2, 11}]}], 
	{"MakeLinear", 7}]//InputForm
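
One way to picture this is as a rotation of the letters, with every bond index shifted by the same amount modulo the sequence length. The sketch below works on a plain string and assumes that linearizing at position i makes letter i the new first letter; renumber is a hypothetical helper, and the ordering of indices in the bond that BioSequenceModify actually returns may differ:

Wolfram Language code:

(* Plain-string illustration of the index arithmetic; an assumption, not the built-in implementation. *)
seq = "MESLVPGFNEKTHVQL";
start = 7;
n = StringLength[seq];

(* Rotate so that the letter at position start comes first. *)
StringRotateLeft[seq, start - 1]

(* Map an old position k to its position in the rotated string. *)
renumber[k_] := Mod[k - start, n] + 1

(* Old positions 2 and 11 land on 12 and 5, which still hold E and K. *)
renumber /@ {2, 11}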

"OutermostBondRepresentation" (1)

Moving bonds outward will bring them from inside any motif or strand to the outermost sequence structure:

Wolfram Language code:

BioSequenceModify[
	BioSequence[{BioSequence["HybridStrand", 
	{BioSequence["DNA", "CAGT", Bond[{2, 4}, "MultiHydrogen"]], 
	BioSequence["RNA", "GUA"]
	}
	]}], 
	"OutermostBondRepresentation"
	]//InputForm

"SplitDisconnectedCollection" (1)

Splitting a collection renumbers the bonds based on the new collection memberships:

Wolfram Language code:

#["SequenceBondList"]& /@ BioSequenceModify[
	BioSequence[
	{BioSequence["DNA", "A"], 
	BioSequence["RNA", "C"], 
	BioSequence["DNA", "G"], 
	BioSequence["DNA", "T"]}, 
	{Bond[{{1, 1}, {4, 1}}], Bond[{{2, 1}, {3, 1}}]}
	], 
	"SplitDisconnectedCollection"
	]
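
The grouping in this example can be pictured as finding connected components in a graph whose vertices are the strands and whose edges are the bonds between strands. A sketch under that assumption, using general graph functions rather than anything specific to BioSequenceModify:

Wolfram Language code:

(* Strand indices and strand-to-strand bonds taken from the example above. *)
strands = Range[4];
bonds = {{1, 4}, {2, 3}};

(* Strands 1 and 4 form one component; strands 2 and 3 form the other. *)
ConnectedComponents[Graph[strands, UndirectedEdge @@@ bonds]]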

Applications (1)

A circular peptide:

Wolfram Language code:

circ = BioSequence["CircularPeptide", "QTGGSFFEPFNSYNSGTWEKADGYSNGGVFNCTWRANNVNFTNDGKLKLGLTSSAYNKFDCAEYRST\
NIYGYGLYEVSMKPAKNTGIVSSFFTYTGPAHGTQWDEIDIEFLGKDTTKVQFNYYTNGVGGHEKVISLGFDASKGFHTYAFDWQPGYIKWYVDGVLKH\
TATANIPSTPGKIMMNLWNGTGVDDWLGSYNGANPLYAEYDWVKYTSN", {}];

Various peptides can be related to each other through a circular permutation:

Wolfram Language code: BioSequenceModify[circ, "MakeLinear"]

Wolfram Language code: % === BioSequence[First[Import["https://www.rcsb.org/fasta/entry/2AYH", "FASTA"]]]

Wolfram Language code: BioSequenceModify[circ, {"MakeLinear", 84}]

Wolfram Language code: % === BioSequence[First[Import["https://www.rcsb.org/fasta/entry/1AJK", "FASTA"]]]

Wolfram Language code: BioSequenceModify[circ, {"MakeLinear", 127}]

Wolfram Language code: % === BioSequence[First[Import["https://www.rcsb.org/fasta/entry/1AJO", "FASTA"]]]
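
Whether two linear sequences are circular permutations of each other can also be checked directly on their letter strings. A minimal sketch; circularPermutationQ is a hypothetical helper, not a built-in function:

Wolfram Language code:

(* b is a rotation of a exactly when the lengths match and b occurs inside a joined with itself. *)
circularPermutationQ[a_String, b_String] :=
	StringLength[a] == StringLength[b] && StringContainsQ[a <> a, b]

(* The second string is the first one rotated to start at position 7, so this gives True. *)
circularPermutationQ["MESLVPGFNEKTHVQL", "GFNEKTHVQLMESLVP"]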

Possible Issues (2)

A given modification may not apply to a particular type of sequence:

Wolfram Language code: BioSequenceModify[BioSequence["DNA", "ACTGATATAC"], "DropFromStopLetter"]

If a bond type cannot be inferred, an untyped bond is added:

Wolfram Language code:

BioSequenceModify[
	BioSequence["Peptide", "DGGGC"], 
	{"AddBond", {1, 5}}]//InputForm

Neat Examples (1)

Represent the protein preproinsulin as a BioSequence:

Wolfram Language code:

preproinsulin = BioSequence["Peptide", "MALWMRLLPLLALLALWGPDPAAAFVNQHLCGSHLVEALYLVCGERGFFYTPKTRREAEDLQVGQVELGGGPGAG\
SLQPLALEGSLQKRGIVEQCCTSICSLYQLENYCN", {}];

Remove the signal peptide sequence to make proinsulin:

Wolfram Language code: proinsulin = StringDrop[preproinsulin, 24]

Add the disulfide bonds and split the proinsulin sequence to make insulin:

Wolfram Language code:

insulin = BioSequenceModify[BioSequence[StringSplit[proinsulin, BioSequence["Peptide", "RREAEDLQVGQVELGGGPGAGSLQPLALEGSLQKR", {}]]], 
	{"AddBond", 
	{Bond[{{2, 1, 6}, {2, 1, 11}}, "DisulfideBridges"], Bond[{{2, 1, 7}, {1, 1, 7}}, "DisulfideBridges"], Bond[{{2, 1, 20}, {1, 1, 19}}, "DisulfideBridges"]}
}]

Wolfram Language code: BioSequencePlot[insulin]
