
Index

D | E | M

D

  • datasail
    • module
  • datasail() (in module datasail.sail)

E

  • eval_splits
    • module
  • eval_splits() (in module datasail.eval)

M

  • module
    • datasail
    • eval_splits

© Copyright 2025, Roman Joeres.
