MoleculeFingerprint[mol]
returns a data structure representing features of the molecule mol.
MoleculeFingerprint[mol,method]
uses the specified method to generate a fingerprint for mol.
MoleculeFingerprint[mol,method,"format"]
gives a fingerprint in the specified format.
MoleculeFingerprint[{mol1,mol2,…},…]
gives the fingerprint for each of the moli.
MoleculeFingerprint
Details
- MoleculeFingerprint generates a hash‐like vector encoding the chemical structure and features of a Molecule for similarity or machine‐learning purposes.
- Some fingerprint methods use a predefined set of molecule patterns, known as substructure keys, with each element in the output corresponding to a particular key. Other methods generate substructures algorithmically and hash these substructures to determine which element to set.
- Possible settings for method include:
    "AtomPairs"                      all pairs of atoms
    "ExtendedConnectivity"           substructures defined as radial atom neighborhoods
    "MACCSKeys"                      "Molecular ACCess System", a set of 166 predefined keys
    "Topological"                    all connected subgraphs with bond count in a given range
    {"SubstructureKey",{patt1,…}}    user‐supplied substructure keys, given as MoleculePattern objects
    "FP3SubstructureKeys"            a set of 55 pre-defined keys for organic functional groups
    "FP4SubstructureKeys"            304 diverse functional groups defined via SMARTS patterns
    {"method",optval,…}              a specified method with method‐specific options
- MoleculeFingerprint[mol] is equivalent to MoleculeFingerprint[mol,"ExtendedConnectivity"].
- Possible values for "format" include:
    "CountVector"     an Association of features and the number of times they are found in the molecule
    "NumericArray"    a NumericArray of ones and zeros
    "BitVector"       a "BitVector" data structure
    "SparseArray"     a SparseArray object
    "List"            a List of ones and zeros
- All output formats except for "CountVector" contain equivalent data, a list of 1s and 0s indicating the presence or absence of a molecule feature.
- The "CountVector" output format records not just the presence of a feature but also how many times the feature occurs. For "AtomPairs", "ExtendedConnectivity", "MACCSKeys", "Topological", and "TopologicalTorsion" fingerprint types, the features are integers. For substructure key fingerprints, the features are MoleculePattern objects.
- MoleculeFingerprint[mol,method] is equivalent to MoleculeFingerprint[mol,method,"List"].
- Atom pairs are molecular substructures defined by two atoms and the number of bonds along the shortest path between them. Each distinct pair of atoms is converted to an integer, which is then used to set a bit in the output vector. Alternatively, the integers are kept and used as the keys in an association whose values indicate the number of times a certain atom pair occurs in the molecule.
- "AtomPairs" fingerprints support the following method options:
    "CountBounds"         {1, 2, 4, 8}    bounds to use for count simulation
    "CountSimulation"     False           whether to use count simulation
    "IncludeChirality"    False           whether to include chirality
    IncludeHydrogens      False           whether to include hydrogen atoms
    "MaxDistance"         30              largest distance to look for atom pairs
    "MinDistance"         1               smallest distance pairs can be separated by
    "Size"                2048            number of bits in numeric fingerprint
- Extended connectivity fingerprints are also known as "circular fingerprints" or "Morgan fingerprints" and are generated using a variation on the "Morgan algorithm".
- Atom environments are generated by starting with a given atom and finding all atoms within a given radius. The atoms and bonds within that radius define the molecular feature that is converted to an integer, which is then used to set a bit in the output vector.
- "ExtendedConnectivity" fingerprints support the following method options:
    "CountBounds"        {1, 2, 4, 8}    bounds to use for count simulation
    "CountSimulation"    False           whether to use count simulation
    IncludeHydrogens     False           whether to include hydrogen atoms
    "Radius"             2               the number of iterations to grow the fingerprint
    "Size"               2048            number of bits in numeric fingerprint
    "UseBondTypes"       True            whether to use bond types when generating bond invariants
    "UseFeatures"        False           whether to use chemical features like H-bond acceptor, donor, ionizable, etc. when generating atom invariants
- Topological fingerprints are also known as "RDKit fingerprints" and are based on the Daylight algorithm.
- In this fingerprinting method, all subgraphs within a given size range are enumerated. The hash for each substructure is determined by the atomic numbers and aromaticity of the included atoms and, optionally, the bond types along the path.
- "Topological" fingerprints support the following method options:
    "BitsPerFeature"      2               the number of bits that are set for each path or subgraph
    "CountBounds"         {1, 2, 4, 8}    bounds to use for count simulation
    "CountSimulation"     False           whether to use count simulation
    IncludeHydrogens      False           whether to include hydrogen atoms
    "MaxPathLength"       7               the longest path length
    "MinPathLength"       1               the smallest path length
    "Size"                2048            number of bits in numeric fingerprint
    "UseBondTypes"        True            whether to include bond types
    "UseBranchedPaths"    True            whether to use branched paths
- User-supplied substructure keys can be given as a list of MoleculePattern objects or an Association whose values are molecule patterns.
Atom Pairs
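The atom-pair description in the Details notes can be illustrated with graph functions alone. The following is a minimal sketch, assuming a toy molecule given as hypothetical atoms and bonds lists (the heavy atoms of ethanol) and a hypothetical 16-element output. Real "AtomPairs" fingerprints encode further atom properties and use their own integer encoding, so the values produced by MoleculeFingerprint will differ.

```wolfram
(* Toy molecule: heavy atoms of ethanol, C-C-O. Illustrative data, not MoleculeFingerprint input. *)
atoms = {"C", "C", "O"};
bonds = {{1, 2}, {2, 3}};

(* Bond graph and shortest-path distances (in bonds) between all atoms *)
g = Graph[Range[Length[atoms]], UndirectedEdge @@@ bonds];
dist = GraphDistanceMatrix[g];

(* One feature per unordered atom pair: sorted element symbols plus bond distance *)
features = Table[
   {Sort[atoms[[pair]]], dist[[pair[[1]], pair[[2]]]]},
   {pair, Subsets[Range[Length[atoms]], {2}]}];

(* Count-vector-style result: how often each feature occurs *)
counts = Counts[features]
(* <|{{"C", "C"}, 1} -> 1, {{"C", "O"}, 2} -> 1, {{"C", "O"}, 1} -> 1|> *)

(* Bit-vector-style result: hash each feature to one of size positions and set it *)
size = 16;
positions = Union[(Mod[Hash[#], size] + 1 &) /@ Keys[counts]];
bits = ReplacePart[ConstantArray[0, size], Thread[positions -> 1]]
```

Because several features can hash to the same position, the number of set bits may be smaller than the number of distinct features. This is why the count form carries more information than the bit form.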
Extended Connectivity
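The idea of growing atom environments by radius, described in the Details notes, can be sketched as follows. This is a simplified illustration on a hypothetical toy molecule (the heavy atoms of isopropanol). The helper names environment and feature are made up for this example. The actual method derives integer invariants iteratively from atom and bond properties rather than from plain element lists.

```wolfram
(* Toy molecule: heavy atoms of isopropanol. Illustrative data only. *)
atoms = {"C", "C", "C", "O"};
bonds = {{1, 2}, {2, 3}, {2, 4}};
g = Graph[Range[Length[atoms]], UndirectedEdge @@@ bonds];
dist = GraphDistanceMatrix[g];

(* Indices of all atoms within radius bonds of the center atom *)
environment[center_, radius_] :=
  Select[Range[Length[atoms]], dist[[center, #]] <= radius &];

(* Environments of the central carbon (atom 2) and the oxygen (atom 4) at radius 1 *)
environment[2, 1]   (* {1, 2, 3, 4} *)
environment[4, 1]   (* {2, 4} *)

(* Simplified feature: center element plus the sorted elements in its environment *)
feature[center_, radius_] :=
  {atoms[[center]], Sort[atoms[[environment[center, radius]]]]};

(* Distinct features over all atoms for radii 0 through 2 *)
features = Union[
   Flatten[Table[feature[a, r], {r, 0, 2}, {a, Length[atoms]}], 1]];
Length[features]
```

A larger radius can only add environments, which is consistent with the role of the "Radius" option as the number of growth iterations.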
Topological Fingerprints
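The enumeration of paths within a length range, as described in the Details notes, can be sketched with FindPath. This is a minimal illustration on a hypothetical toy molecule. The names minLen, maxLen and canonical are made up for this example. Only linear paths are covered here, whereas the method itself can also use branched subgraphs and hashes each substructure from atomic numbers, aromaticity and bond types.

```wolfram
(* Toy molecule: heavy atoms of isopropanol. Illustrative data only. *)
atoms = {"C", "C", "C", "O"};
bonds = {{1, 2}, {2, 3}, {2, 4}};
g = Graph[Range[Length[atoms]], UndirectedEdge @@@ bonds];

(* Path lengths are counted in bonds *)
minLen = 1; maxLen = 2;

(* All simple paths between each unordered pair of atoms within the length range *)
paths = Flatten[
   Table[FindPath[g, s, t, {minLen, maxLen}, All],
    {s, Length[atoms]}, {t, s + 1, Length[atoms]}], 2];

(* Describe a path by its element sequence, independent of traversal direction *)
canonical[p_] := First[Sort[{atoms[[p]], Reverse[atoms[[p]]]}]];

(* Count how often each element sequence occurs *)
pathCounts = Counts[canonical /@ paths]
(* <|{"C", "C"} -> 2, {"C", "C", "C"} -> 1, {"C", "C", "O"} -> 2, {"C", "O"} -> 1|> *)
```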
Substructure Key Fingerprints
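User-supplied keys follow the {"SubstructureKey",{patt1,…}} form listed in the Details notes. The following is a sketch under stated assumptions: the three SMARTS strings and the labels in the Association are illustrative choices, not a recommended key set, and the outputs are not shown because they have not been checked against a particular version.

```wolfram
(* Hypothetical key set: hydroxyl group, carbonyl group, benzene ring *)
keys = {MoleculePattern["[OX2H]"], MoleculePattern["C=O"],
   MoleculePattern["c1ccccc1"]};

(* Ethanol, created from a SMILES string *)
mol = Molecule["CCO"];

(* One output element per key, in the order the keys were supplied *)
fp = MoleculeFingerprint[mol, {"SubstructureKey", keys}, "List"]

(* The same kind of keys given as an Association whose values are molecule patterns *)
namedKeys = <|"hydroxyl" -> MoleculePattern["[OX2H]"],
   "carbonyl" -> MoleculePattern["C=O"]|>;
fpNamed = MoleculeFingerprint[mol, {"SubstructureKey", namedKeys}, "List"]
```

Since each output element corresponds to one key, a small hand-picked key set gives a short, directly interpretable fingerprint, in contrast to the hashed methods.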
Examples
Basic Examples (3)
Get the fingerprint for a molecule:
MoleculeFingerprint[Molecule[{"C", "C", "C", Atom["C", "HydrogenCount" -> 1], "C", "C",
Atom["C", "HydrogenCount" -> 1], "C", "C", Atom["C", "HydrogenCount" -> 1], "C", "C"},
{Bond[{1, 2}, "Single"], Bond[{2, 3}, "Single"], Bond[{3, 4}, "Single"], Bond[{4, 5}, "S ... nterclockwise"],
Association["StereoType" -> "Tetrahedral", "ChiralCenter" -> 10,
"Direction" -> "Counterclockwise"], Association["StereoType" -> "DoubleBond",
"StereoBond" -> {5, 6}, "Ligands" -> {4, 7}, "Value" -> "Together"]}}]]//Short
Get the fingerprints for a list of molecules:
MoleculeFingerprint[{Molecule[{"O", "C", Atom["C", "HydrogenCount" -> 1], "O", Atom["C", "HydrogenCount" -> 1],
Atom["C", "HydrogenCount" -> 1], Atom["C", "HydrogenCount" -> 1], "N", "C", "O", "C", "O",
Atom["N", "FormalCharge" -> 1, "UnpairedElectronCount" -> 1] ... "Direction" -> "Counterclockwise"], Association["StereoType" -> "Tetrahedral",
"ChiralCenter" -> 6, "Direction" -> "Clockwise"], Association["StereoType" -> "Tetrahedral",
"ChiralCenter" -> 7, "Direction" -> "Counterclockwise"]}}], Molecule[{"C", "C", Atom["S", "UnpairedElectronCount" -> 1], "O", "C", "C", "C", "C", "C", "C",
"C"}, {Bond[{1, 2}, "Single"], Bond[{2, 3}, "Single"], Bond[{3, 4}, "Double"],
Bond[{3, 5}, "Single"], Bond[{5, 6}, "Single"], Bond[{6, 7}, "Singl ... ,
Bond[{7, 9}, "Single"], Bond[{7, 10}, "Single"], Bond[{3, 11}, "Single"],
Bond[{8, 5}, "Single"]}, {StereochemistryElements ->
{Association["StereoType" -> "Tetrahedral", "ChiralCenter" -> 3,
"Direction" -> "Counterclockwise"]}}], Molecule[{"O", "C", Atom["C", "HydrogenCount" -> 1], "O", Atom["C", "HydrogenCount" -> 1], "O",
"C", "C", "C", "C", "C", "C", "S", "C", "N", "C", "N", Atom["C", "HydrogenCount" -> 1], "C", "S",
"C", "O", "O", Atom["C", "HydrogenCount" -> 1], ... er" -> 24, "Direction" -> "Clockwise"], Association["StereoType" -> "Tetrahedral",
"ChiralCenter" -> 25, "Direction" -> "Counterclockwise"],
Association["StereoType" -> "Tetrahedral", "ChiralCenter" -> 26, "Direction" -> "Clockwise"]}}]}]//Shallow
Get the MACCS key fingerprint for a molecule:
MoleculeFingerprint[Molecule[{Atom["O", "FormalCharge" -> -1], Atom["N", "FormalCharge" -> 1], "O", "C", "C", "C", "C",
"C", "C", "Cl", "C", "F", "F", "F"}, {Bond[{1, 2}, "Single"], Bond[{2, 3}, "Double"],
Bond[{2, 4}, "Single"], Bond[{4, 5}, "Aromatic"], Bond[{ ... ,
Bond[{6, 7}, "Aromatic"], Bond[{7, 8}, "Aromatic"], Bond[{8, 9}, "Aromatic"],
Bond[{9, 10}, "Single"], Bond[{6, 11}, "Single"], Bond[{11, 12}, "Single"],
Bond[{11, 13}, "Single"], Bond[{11, 14}, "Single"], Bond[{9, 4}, "Aromatic"]}, {}], "MACCSKeys"]
Get the topological fingerprint for a molecule as a "BitVector":
MoleculeFingerprint[Molecule[{"N", "C", "O", "C", "N", "C", "O", Atom["C", "HydrogenCount" -> 1], "N", "C", "O",
Atom["C", "HydrogenCount" -> 1], "N", "C", "O", Atom["C", "HydrogenCount" -> 1], "C", "C", "C",
"C", "C", "C", "C", "O", "N", "C", "C", "C", "C", "C" ... "Direction" -> "Clockwise"], Association["StereoType" -> "Tetrahedral", "ChiralCenter" -> 12,
"Direction" -> "Counterclockwise"], Association["StereoType" -> "Tetrahedral",
"ChiralCenter" -> 16, "Direction" -> "Counterclockwise"]}}], "Topological", "BitVector"]
Scope (2)
Find the atom‐pair fingerprints for a list of molecules:
MoleculeFingerprint[{Molecule[{"O", "C", "C", "C", "C", "O", "C", "C", "C", "C", "C", "C"},
{Bond[{1, 2}, "Double"], Bond[{2, 3}, "Single"], Bond[{3, 4}, "Double"], Bond[{3, 5}, "Single"],
Bond[{2, 6}, "Single"], Bond[{6, 7}, "Single"], Bond[{7, 8}, "Aromatic"],
Bond[{8, 9}, "Aromatic"], Bond[{9, 10}, "Aromatic"], Bond[{10, 11}, "Aromatic"],
Bond[{11, 12}, "Aromatic"], Bond[{12, 7}, "Aromatic"]}, {}], Molecule[{"C", "N", "C", "C", "N", "C", "C", "C", "C", "C", "N", "N", "C", "N", "C", "C", "N", "C",
"C", "C", "C", "C", "C", "C", "C", "Cl", "C", "C", "C"},
{Bond[{1, 2}, "Single"], Bond[{2, 3}, "Single"], Bond[{3, 4}, "Single"], Bond[{4, 5}, ... 5, 26}, "Single"], Bond[{25, 27}, "Aromatic"],
Bond[{27, 28}, "Aromatic"], Bond[{13, 29}, "Single"], Bond[{7, 2}, "Single"],
Bond[{17, 8}, "Single"], Bond[{23, 18}, "Aromatic"], Bond[{14, 10}, "Aromatic"],
Bond[{28, 15}, "Aromatic"]}, {}], Molecule[{"C", "C", "C", "C", "C", "C", "C", "N", "C", "C"},
{Bond[{1, 2}, "Single"], Bond[{2, 3}, "Single"], Bond[{2, 4}, "Single"], Bond[{4, 5}, "Single"],
Bond[{5, 6}, "Single"], Bond[{6, 7}, "Single"], Bond[{7, 8}, "Single"], Bond[{7, 9}, "Single"],
Bond[{7, 10}, "Single"], Bond[{8, 2}, "Single"]}, {}], Molecule[{"C", "C", "C", "O", "O", "C", "O", "C", "C", "C", "C"},
{Bond[{1, 2}, "Single"], Bond[{2, 3}, "Single"], Bond[{3, 4}, "Double"], Bond[{3, 5}, "Single"],
Bond[{5, 6}, "Single"], Bond[{6, 7}, "Double"], Bond[{6, 8}, "Single"], Bond[{8, 9}, "Single"],
Bond[{8, 10}, "Single"], Bond[{8, 11}, "Single"]}, {}], Molecule[{"O", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C"},
{Bond[{1, 2}, "Single"], Bond[{2, 3}, "Aromatic"], Bond[{3, 4}, "Aromatic"],
Bond[{4, 5}, "Aromatic"], Bond[{5, 6}, "Aromatic"], Bond[{6, 7}, "Aromatic"],
... "Aromatic"],
Bond[{10, 11}, "Aromatic"], Bond[{11, 12}, "Aromatic"], Bond[{12, 13}, "Aromatic"],
Bond[{13, 14}, "Aromatic"], Bond[{14, 15}, "Aromatic"], Bond[{7, 2}, "Aromatic"],
Bond[{13, 8}, "Aromatic"], Bond[{15, 6}, "Aromatic"]}, {}]}, "AtomPairs", "SparseArray"]
Find the number of set bits in each fingerprint:
Total /@ %
Find the atom‐pair fingerprint for a molecule and return it as a count vector:
MoleculeFingerprint[Molecule[{"C", "N", "C", "C", "N", "C", "C", "C", "C", "C", "N", "N", "C", "N", "C", "C", "N", "C",
"C", "C", "C", "C", "C", "C", "C", "Cl", "C", "C", "C"},
{Bond[{1, 2}, "Single"], Bond[{2, 3}, "Single"], Bond[{3, 4}, "Single"], Bond[{4, 5}, ... 5, 26}, "Single"], Bond[{25, 27}, "Aromatic"],
Bond[{27, 28}, "Aromatic"], Bond[{13, 29}, "Single"], Bond[{7, 2}, "Single"],
Bond[{17, 8}, "Single"], Bond[{23, 18}, "Aromatic"], Bond[{14, 10}, "Aromatic"],
Bond[{28, 15}, "Aromatic"]}, {}], "AtomPairs", "CountVector"]//Short
Generalizations & Extensions (3)
Use the "ExtendedConnectivity" method while specifying the radius used to define atom environments:
Total[MoleculeFingerprint[Molecule[{"C", "C", "O", "O", "C", "C", "C", "C", "C", "C", "O", Atom["C", "HydrogenCount" -> 1],
"C", "C", "C", "N", Atom["C", "HydrogenCount" -> 1], "C", "C", "O", "C", "C",
Atom["C", "HydrogenCount" -> 1], "F", "C", "C", "C", "C"},
{Bond ... "Direction" -> "Counterclockwise"], Association["StereoType" -> "Tetrahedral",
"ChiralCenter" -> 19, "Direction" -> "Counterclockwise"],
Association["StereoType" -> "Tetrahedral", "ChiralCenter" -> 23, "Direction" -> "Clockwise"]}}], {"ExtendedConnectivity", "Radius" -> #}]]& /@ {2, 4}
Use the "AtomPairs" method, including or excluding atom chirality:
Total[MoleculeFingerprint[Molecule[{"O", "C", "O", Atom["C", "HydrogenCount" -> 1], "C", "C", "C", "C", "C", "C",
Atom["C", "HydrogenCount" -> 1], Atom["C", "HydrogenCount" -> 1], "C", "C", "C",
Atom["C", "HydrogenCount" -> 1], "C", Atom["C", "HydrogenCount" -> 1],
... Direction" -> "Counterclockwise"],
Association["StereoType" -> "Tetrahedral", "ChiralCenter" -> 22, "Direction" -> "Clockwise"],
Association["StereoType" -> "Tetrahedral", "ChiralCenter" -> 26,
"Direction" -> "Counterclockwise"]}}], {"AtomPairs", "IncludeChirality" -> #}]]& /@ {True, False}
Find the "Topological" fingerprint, including or excluding hydrogen atoms:
Total[MoleculeFingerprint[Molecule[{"C", "C", Atom["N", "FormalCharge" -> 1], "C", "C", Atom["C", "HydrogenCount" -> 1], "N",
Atom["N", "FormalCharge" -> 1, "UnpairedElectronCount" -> 1], "C", "O", "C", "C", "C", "C", "C",
"C", "C", "C"}, {Bond[{1, 2}, "Single"], Bond ... atic"],
Bond[{17, 18}, "Aromatic"], Bond[{12, 3}, "Single"], Bond[{18, 13}, "Aromatic"],
Bond[{8, 4}, "Single"]}, {StereochemistryElements ->
{Association["StereoType" -> "Tetrahedral", "ChiralCenter" -> 6, "Direction" -> "Clockwise"]}}], {"Topological", IncludeHydrogens -> #}]]& /@ {True, False}
Properties & Relations (1)
A molecule fingerprint can be returned in a number of formats:
({countV, bitV, numericArray, sparseArray, list} = MoleculeFingerprint[Molecule[{"C", "C", "O", "N", "C", "C", "C", "C", "C", "C", "O", "C", "O", "O"},
{Bond[{1, 2}, "Single"], Bond[{2, 3}, "Double"], Bond[{2, 4}, "Single"], Bond[{4, 5}, "Single"],
Bond[{5, 6}, "Aromatic"], Bond[{6, 7}, "Aromatic"], Bond[{7, 8}, "Aromatic"],
Bond[{8, 9}, "Aromatic"], Bond[{9, 10}, "Aromatic"], Bond[{9, 11}, "Single"],
Bond[{8, 12}, "Single"], Bond[{12, 13}, "Double"], Bond[{12, 14}, "Single"],
Bond[{10, 5}, "Aromatic"]}, {}], "Topological", #]& /@ {"CountVector", "BitVector", "NumericArray", "SparseArray", "List"})//Shallow
The "NumericArray", "SparseArray" and "List" formats give the same data:
SameQ@@Map[Normal, {numericArray, sparseArray, list}]
But they differ in the space required to store the data:
Map[ByteCount, {numericArray, sparseArray, list}]
The "BitVector" output type is the most space efficient:
ByteCount@bitV
Since the "CountVector" output records not just substructure presence but frequency as well, it contains more information. In this example, it records 682 occurrences, while the numeric array only records 382:
Total /@ {countV, numericArray}
Possible Issues (1)
Some fingerprint methods are not available as count vectors:
MoleculeFingerprint[Molecule[{Atom["O", "FormalCharge" -> -1], Atom["N", "FormalCharge" -> 1], "O", "C", "C", "C", "C",
"C", "C", "Cl", "C", "F", "F", "F"}, {Bond[{1, 2}, "Single"], Bond[{2, 3}, "Double"],
Bond[{2, 4}, "Single"], Bond[{4, 5}, "Aromatic"], Bond[{ ... ,
Bond[{6, 7}, "Aromatic"], Bond[{7, 8}, "Aromatic"], Bond[{8, 9}, "Aromatic"],
Bond[{9, 10}, "Single"], Bond[{6, 11}, "Single"], Bond[{11, 12}, "Single"],
Bond[{11, 13}, "Single"], Bond[{11, 14}, "Single"], Bond[{9, 4}, "Aromatic"]}, {}], "MACCSKeys", "CountVector"]
Neat Examples (1)
Create a FeatureSpacePlot of pyridines using molecule fingerprints:
FeatureSpacePlot[EntityValue[EntityClass["Chemical", "Pyridines"], "Molecule"],
FeatureExtractor -> (MoleculeFingerprint[#, "Topological"]&), LabelingFunction -> (Placed[Dynamic[MoleculePlot[#1]], Tooltip]&)]
History
Text
Wolfram Research (2026), MoleculeFingerprint, Wolfram Language function, https://reference.wolfram.com/language/ref/MoleculeFingerprint.html.