dstk.workflows package

Submodules

dstk.workflows.stage_workflows module

This module provides classes and factory functions to build and manage complex, multi-stage workflows composed of sequential method executions across different processing modules. It supports workflow validation against predefined templates, method chaining with type enforcement, and flexible execution control, including partial or complete result retrieval.

Key components include:

  • Factory functions such as TextProcessing and PlotEmbeddings, which instantiate common workflows from predefined templates and modules.

The module is designed to support modular, extensible, and maintainable workflow construction for tasks such as text processing and embedding visualization.

dstk.workflows.stage_workflows.PlotEmbeddings(name: str, workflows: dict[str, list[dict[str, dict[str, Any]]]]) → StageWorkflowBuilder

Creates a StageWorkflowBuilder configured for word embedding plotting workflows. The modules included are ‘data_visualization.clustering’ in the first stage and ‘data_visualization.embeddings’ in the second.

Parameters:
  • name (str) – The name of the workflow instance.

  • workflows (StageWorkflow) – A StageWorkflow dictionary defining the workflow steps per module/stage.

Returns:

An instance of StageWorkflowBuilder configured with embedding plotting templates and modules.

Return type:

StageWorkflowBuilder

dstk.workflows.stage_workflows.TextProcessing(name: str, workflows: dict[str, list[dict[str, dict[str, Any]]]]) → StageWorkflowBuilder

Creates a StageWorkflowBuilder configured for text processing workflows. The modules included are ‘tokenizer’ in the first stage and ‘text_processor’ or ‘ngrams’ in the second.

Parameters:
  • name (str) – The name of the workflow instance.

  • workflows (StageWorkflow) – A StageWorkflow dictionary defining the workflow steps per module/stage.

Returns:

An instance of StageWorkflowBuilder configured with text processing templates and modules.

Return type:

StageWorkflowBuilder
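
The nested annotation on workflows can be hard to read at a glance. The sketch below only illustrates the shape of such a dictionary and does not import dstk. The module names come from the description above; the method names, keyword arguments, and the has_stage_workflow_shape helper are hypothetical placeholders, not part of the library.

```python
from typing import Any

# Type aliases mirroring the annotations in the signatures above.
Workflow = list[dict[str, dict[str, Any]]]
StageWorkflow = dict[str, Workflow]

# Keys are module names; each value is an ordered list of steps.
# Every step maps one method name to its keyword arguments.
# NOTE: the method names and kwargs here are hypothetical placeholders.
text_workflows: StageWorkflow = {
    "tokenizer": [
        {"hypothetical_tokenize": {"language": "en"}},
    ],
    "ngrams": [
        {"hypothetical_extract": {"n": 2}},
        {"hypothetical_count": {}},
    ],
}


def has_stage_workflow_shape(candidate: Any) -> bool:
    """Return True if candidate matches dict[str, list[dict[str, dict]]]."""
    if not isinstance(candidate, dict):
        return False
    for module_name, steps in candidate.items():
        if not isinstance(module_name, str) or not isinstance(steps, list):
            return False
        for step in steps:
            if not isinstance(step, dict):
                return False
            for method_name, kwargs in step.items():
                if not isinstance(method_name, str) or not isinstance(kwargs, dict):
                    return False
    return True
```

A dictionary of this shape is what the workflows parameter of TextProcessing and PlotEmbeddings is annotated to accept; which method names are valid depends on the dstk modules themselves.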

dstk.workflows.workflow_tools module

This module provides classes for defining, validating, and executing complex workflows composed of multiple processing steps and stages. It supports dynamic method invocation from specified modules, workflow validation against templates with type and step rules, and optional method wrapping for object-oriented usage.

Key components:

  • Wrapper: Simple container for input data, enabling method injection.

  • WorkflowBuilder: Automates sequential execution of methods in a single workflow, including validation and optional wrapping.

  • StageWorkflowBuilder: Manages multiple workflows organized in stages and modules, enforcing stage/module constraints and chaining workflows.

This module is designed to facilitate building flexible, validated processing workflows with dynamic and modular behavior.

class dstk.workflows.workflow_tools.StageWorkflowBuilder(templates: dict[str, WorkflowTemplate], stage_modules: dict[int, set[str]], name: str, workflows: dict[str, list[dict[str, dict[str, Any]]]])

Bases: object

Manages and runs workflows composed of multiple stages and modules.

Allows sequential execution of workflows associated with various modules/stages, validating and chaining them according to provided templates and configurations.

Parameters:
  • templates (StageTemplate) – A mapping of module names to their workflow templates.

  • stage_modules (StageModules) – A mapping of stage indices to allowed module names.

  • name (str) – Name of the stage workflow builder instance.

  • workflows (StageWorkflow) – A mapping of module names to their workflows.
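
To make the relationship between stage_modules and workflows concrete, here is a minimal sketch of one way a stage constraint could be checked. It is not the library's implementation: the check_stage_order function, the rule of exactly one module per stage, and the choice of 0 as the first stage index are all assumptions made for illustration.

```python
def check_stage_order(
    stage_modules: dict[int, set[str]],
    workflows: dict[str, list],
) -> list[str]:
    """Return the workflow's module names in stage order, or raise ValueError.

    Assumed rule (hypothetical): each stage uses exactly one module,
    and that module must belong to the stage's allowed set.
    """
    allowed = set().union(*stage_modules.values())
    unknown = set(workflows) - allowed
    if unknown:
        raise ValueError(f"modules not allowed in any stage: {sorted(unknown)}")

    ordered = []
    for stage in sorted(stage_modules):
        chosen = [name for name in workflows if name in stage_modules[stage]]
        if len(chosen) != 1:
            raise ValueError(f"stage {stage} needs exactly one module, got {chosen}")
        ordered.append(chosen[0])
    return ordered


# Stage layout taken from the TextProcessing description; indices are assumed.
stage_modules = {0: {"tokenizer"}, 1: {"text_processor", "ngrams"}}
order = check_stage_order(stage_modules, {"ngrams": [], "tokenizer": []})
```

Here order lists the modules by stage regardless of the key order in the workflows dictionary, which is the property a stage-based builder needs before chaining one module's output into the next.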

class dstk.workflows.workflow_tools.WorkflowBuilder(name: str, module_name: str, workflow: list[dict[str, dict[str, Any]]], template: WorkflowTemplate | None = None, wrapper: bool = False)

Bases: object

Automates the execution of a sequence of methods as a workflow.

This class dynamically imports and executes a chain of methods defined in a workflow, optionally validates the workflow against a template, and can wrap methods for object-oriented style usage.

Parameters:
  • name (str) – Name of the workflow instance.

  • module_name (str) – Name of the module containing the methods to be executed.

  • workflow (Workflow) – A workflow definition: a list of dicts, each mapping a method name to its keyword arguments.

  • template (WorkflowTemplate or None) – Optional workflow template for validation and typing rules. Defaults to None.

  • wrapper (bool) – If True, creates a Wrapper instance allowing method calls as object methods with internal data injection. Defaults to False.

Usage:

CustomWorkflow = WorkflowBuilder(...)
result = CustomWorkflow(input_data)
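
The idea of dynamically importing a module and chaining its methods can be sketched with the standard library alone. This is not WorkflowBuilder's actual code: run_workflow is a hypothetical helper, the standard textwrap module stands in for a dstk module, and passing the previous result as the first positional argument is an assumption about how chaining works.

```python
import importlib
from typing import Any


def run_workflow(
    module_name: str,
    workflow: list[dict[str, dict[str, Any]]],
    input_data: Any,
) -> Any:
    """Apply each step's function to the running result, in order."""
    module = importlib.import_module(module_name)
    result = input_data
    for step in workflow:
        for method_name, kwargs in step.items():
            function = getattr(module, method_name)
            # Assumption: the previous result is passed as the first argument.
            result = function(result, **kwargs)
    return result


# textwrap is used purely for illustration; it is not a dstk module.
steps = [{"dedent": {}}, {"indent": {"prefix": "> "}}]
quoted = run_workflow("textwrap", steps, "  first\n  second")
print(quoted)
# > first
# > second
```

Looking functions up by name with getattr is what lets a workflow be declared as plain data (a list of dicts) rather than as code, at the cost of errors surfacing only at run time unless the definition is validated first.
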

class dstk.workflows.workflow_tools.Wrapper(input_data: Any)

Bases: object
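
The page gives no further detail on Wrapper, so the following is only a sketch of what "a container with method injection" can look like in Python. DataWrapper and its inject method are hypothetical names, not dstk's API, and textwrap again stands in for a processing module.

```python
import functools
import textwrap
from typing import Any, Callable


class DataWrapper:
    """Hypothetical stand-in: holds data and exposes functions as methods."""

    def __init__(self, input_data: Any) -> None:
        self.data = input_data

    def inject(self, function: Callable[..., Any]) -> None:
        """Attach function as a method that operates on the stored data."""

        @functools.wraps(function)
        def method(**kwargs: Any) -> "DataWrapper":
            # Feed the stored data in as the first argument and keep the result.
            self.data = function(self.data, **kwargs)
            return self  # returning self allows call chaining

        setattr(self, function.__name__, method)


wrapped = DataWrapper("  first\n  second")
wrapped.inject(textwrap.dedent)
wrapped.inject(textwrap.indent)
wrapped.dedent().indent(prefix="> ")
print(wrapped.data)
# > first
# > second
```

With this pattern the caller never passes the data explicitly; each injected method reads and updates the stored value, which is one plausible reading of "internal data injection" in the wrapper parameter description.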

Module contents