Kedro integration functions

API reference for all Kedro integration functions. The how-to guide on Kedro Data Catalog integration contains more information.

vizro.integrations.kedro

catalog_from_project

catalog_from_project(project_path, env=None, extra_params=None)

Return the Kedro Data Catalog associated with a Kedro project.

Parameters:

  • project_path (Union[str, Path]) –

    Path to the Kedro project root directory.

  • env (Optional[str], default: None ) –

    Kedro configuration environment to be used. Defaults to "local".

  • extra_params (Optional[dict[str, Any]], default: None ) –

    Optional dictionary containing extra project parameters for the underlying KedroContext. If specified, will update (and therefore take precedence over) the parameters retrieved from the project configuration.

Returns:

  • CatalogProtocol

    A Kedro Data Catalog.

Examples:

>>> from vizro.integrations import kedro as kedro_integration
>>> catalog = kedro_integration.catalog_from_project("/path/to/kedro/project")
Source code in src/vizro/integrations/kedro/_data_manager.py
def catalog_from_project(
    project_path: Union[str, Path], env: Optional[str] = None, extra_params: Optional[dict[str, Any]] = None
) -> CatalogProtocol:
    """Return the Kedro Data Catalog associated with a Kedro project.

    Args:
        project_path: Path to the Kedro project root directory.
        env: Kedro configuration environment to be used. Defaults to "local".
        extra_params: Optional dictionary containing extra project parameters
            for the underlying KedroContext. If specified, will update (and therefore
            take precedence over) the parameters retrieved from the project
            configuration.

    Returns:
         A Kedro Data Catalog.

    Examples:
        >>> from vizro.integrations import kedro as kedro_integration
        >>> catalog = kedro_integration.catalog_from_project("/path/to/kedro/project")
    """
    bootstrap_project(project_path)
    with KedroSession.create(
        project_path=project_path, env=env, save_on_close=False, extra_params=extra_params
    ) as session:
        return session.load_context().catalog
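
The precedence rule described for extra_params can be pictured as a dictionary update. The snippet below is a minimal sketch with made-up parameter names and values; it shows only a shallow, top-level merge and makes no claim about how Kedro treats nested keys.

```python
# Hypothetical values standing in for parameters read from the project configuration.
project_params = {"test_size": 0.2, "random_state": 42}

# Hypothetical overrides, as they might be passed via extra_params.
extra_params = {"test_size": 0.3}

# Update semantics: keys present in extra_params win, all other keys are kept.
effective_params = {**project_params, **extra_params}
print(effective_params)  # {'test_size': 0.3, 'random_state': 42}
```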

datasets_from_catalog

datasets_from_catalog(catalog, *, pipeline=None)

Return the Kedro Dataset loading functions associated with a Kedro Data Catalog.

Parameters:

  • catalog (CatalogProtocol) –

    Kedro Data Catalog.

  • pipeline (Optional[Pipeline], default: None ) –

    Optional Kedro pipeline. If specified, datasets used by the pipeline that match a dataset factory pattern in the catalog are also returned.

Returns:

  • dict[str, pd_DataFrameCallable]

    A dictionary mapping dataset names to Kedro Dataset loading functions.

Examples:

>>> from vizro.integrations import kedro as kedro_integration
>>> dataset_loaders = kedro_integration.datasets_from_catalog(catalog)
Source code in src/vizro/integrations/kedro/_data_manager.py
def datasets_from_catalog(
    catalog: CatalogProtocol, *, pipeline: Optional[Pipeline] = None
) -> dict[str, pd_DataFrameCallable]:
    """Return the Kedro Dataset loading functions associated with a Kedro Data Catalog.

    Args:
        catalog: Kedro Data Catalog.
        pipeline: Optional Kedro pipeline. If specified, datasets used by the pipeline that match a dataset factory
            pattern in the catalog are also returned.

    Returns:
         A dictionary mapping dataset names to Kedro Dataset loading functions.

    Examples:
        >>> from vizro.integrations import kedro as kedro_integration
        >>> dataset_loaders = kedro_integration.datasets_from_catalog(catalog)
    """
    if parse(version("kedro")) < parse("0.19.9"):
        return _legacy_datasets_from_catalog(catalog)

    # This doesn't include things added to the catalog at run time but that is ok for our purposes.
    config_resolver = catalog.config_resolver
    kedro_datasets = config_resolver.config.copy()

    if pipeline:
        # Go through all dataset names that weren't in catalog and try to resolve them. Those that cannot be
        # resolved give an empty dictionary and are ignored.
        for dataset_name in set(pipeline.datasets()) - set(kedro_datasets):
            if dataset_config := config_resolver.resolve_pattern(dataset_name):
                kedro_datasets[dataset_name] = dataset_config

    def _catalog_release_load(dataset_name: str):
        # release is needed to clear the Kedro load version cache so that the dashboard always fetches the most recent
        # version rather than being stuck on the same version as when the app started.
        catalog.release(dataset_name)
        return catalog.load(dataset_name)

    vizro_data_sources = {}

    for dataset_name, dataset_config in kedro_datasets.items():
        # "type" key always exists because we filtered out patterns that resolve to empty dictionary above.
        if "pandas" in dataset_config["type"]:
            # We need to bind dataset_name=dataset_name early to avoid dataset_name late-binding to the last value in
            # the for loop.
            vizro_data_sources[dataset_name] = lambda dataset_name=dataset_name: _catalog_release_load(dataset_name)

    return vizro_data_sources

pipelines_from_project

pipelines_from_project(project_path)

Return the Kedro Pipelines associated with a Kedro project.

Parameters:

  • project_path (Union[str, Path]) –

    Path to the Kedro project root directory.

Returns:

  • dict[str, Pipeline]

    A dictionary mapping pipeline names to Kedro Pipelines.

Examples:

>>> from vizro.integrations import kedro as kedro_integration
>>> pipelines = kedro_integration.pipelines_from_project("/path/to/kedro/project")
Source code in src/vizro/integrations/kedro/_data_manager.py
def pipelines_from_project(project_path: Union[str, Path]) -> dict[str, Pipeline]:
    """Return the Kedro Pipelines associated with a Kedro project.

    Args:
        project_path: Path to the Kedro project root directory.

    Returns:
         A dictionary mapping pipeline names to Kedro Pipelines.

    Examples:
        >>> from vizro.integrations import kedro as kedro_integration
        >>> pipelines = kedro_integration.pipelines_from_project("/path/to/kedro/project")
    """
    bootstrap_project(project_path)
    from kedro.framework.project import pipelines

    return pipelines
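
The three functions on this page are typically used together. The helper below is a sketch of one way to chain them, assuming Vizro and Kedro are installed and that the project defines a pipeline under Kedro's "__default__" name; register_kedro_datasets and its registry argument are hypothetical names introduced for illustration.

```python
from pathlib import Path
from typing import Union


def register_kedro_datasets(project_path: Union[str, Path], registry, pipeline_name: str = "__default__"):
    """Copy the loaders for a Kedro project's pandas datasets into a mapping-like registry."""
    # Imported lazily so that defining this helper does not require Vizro or Kedro.
    from vizro.integrations import kedro as kedro_integration

    catalog = kedro_integration.catalog_from_project(project_path)
    pipelines = kedro_integration.pipelines_from_project(project_path)

    # Passing a pipeline also picks up datasets that match dataset factory patterns.
    loaders = kedro_integration.datasets_from_catalog(catalog, pipeline=pipelines[pipeline_name])

    for dataset_name, loader in loaders.items():
        registry[dataset_name] = loader
    return sorted(loaders)
```

In a dashboard, registry would usually be Vizro's data_manager (from vizro.managers import data_manager), which accepts assignment of loading functions by name; a plain dict also works for inspecting which datasets were found.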