Significance Test module

significance_test.clean_up_asset_name(asset_name: str) -> str

Remove suffixes and map to cleaned names.

This function removes the _no_htl and _random suffixes from the asset name. For example, AutoFilter_Chen_Like_no_htl becomes AutoFilter_Chen_Like.

Args:

asset_name (str): The name of the asset to clean; names without a known suffix are potentially left unchanged.

Returns:

str: The cleaned asset name.
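
The suffix removal described above could be sketched as follows. This is a minimal illustration, not the module's implementation: the function name `strip_known_suffix` and the `SUFFIXES` tuple are hypothetical, and any further name mapping the real function performs is not shown.

```python
# Hypothetical sketch; the real clean_up_asset_name may behave differently.
SUFFIXES = ("_no_htl", "_random")  # the two suffixes named in the description


def strip_known_suffix(asset_name: str) -> str:
    """Return asset_name without a trailing _no_htl or _random suffix."""
    for suffix in SUFFIXES:
        if asset_name.endswith(suffix):
            # Slice off exactly the matched suffix and stop.
            return asset_name[: -len(suffix)]
    # No known suffix: the name is returned unchanged.
    return asset_name


print(strip_known_suffix("AutoFilter_Chen_Like_no_htl"))  # AutoFilter_Chen_Like
```

Only a trailing match is removed in this sketch, so a name that merely contains one of the suffixes in the middle is left alone.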

significance_test.collect_asset_paths(ASSET_PATHS: Path | None = None) -> DefaultDict[str, List]

Collect asset directory paths from the given base asset path.

Args:

ASSET_PATHS (Optional[Path]): The base path where assets are stored. Defaults to a computed path.

Returns:

DefaultDict[str, List]: A dictionary mapping task names to lists of directory paths containing assets.
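
One way to build such a mapping is sketched below. The directory layout (`base/<task>/<asset_dir>/`) and the function name `collect_dirs_by_task` are assumptions made for illustration; the layout the module actually expects is not documented here.

```python
from collections import defaultdict
from pathlib import Path
from typing import DefaultDict, List


def collect_dirs_by_task(base: Path) -> DefaultDict[str, List[Path]]:
    """Group asset directories by the name of their parent task directory.

    Assumes a hypothetical layout of base/<task>/<asset_dir>/.
    """
    paths: DefaultDict[str, List[Path]] = defaultdict(list)
    # Each immediate subdirectory of base is treated as a task.
    for task_dir in sorted(p for p in base.iterdir() if p.is_dir()):
        # Each subdirectory of a task is treated as one asset directory.
        for asset_dir in sorted(p for p in task_dir.iterdir() if p.is_dir()):
            paths[task_dir.name].append(asset_dir)
    return paths
```

Plain files are skipped at both levels, and a task directory with no subdirectories does not appear as a key in the result.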

significance_test.filter_no_htl(asset_name: str) -> bool

Helper function for create_comparison_df(). It is used to compare No HTL with HTL, so the filter condition selects assets with _no_htl in their name.

Args:

asset_name (str): The asset name to check.

Returns:

bool: True if the asset name ends with _no_htl, otherwise False.

significance_test.filter_random(asset_name: str) -> bool

Helper function for create_comparison_df(). It is used to compare Random (Filled Up) with HTL, so the filter condition selects assets with _random in their name.

Args:

asset_name (str): The asset name to check.

Returns:

bool: True if the asset name ends with _random, otherwise False.
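
Both filters share the `Callable[[str], bool]` shape that other functions in this module accept as `filter_condition`. The sketch below follows the Returns descriptions (a suffix check) and shows how such predicates can be passed around. The names `ends_with_no_htl`, `ends_with_random`, and `select`, as well as the sample asset names, are hypothetical.

```python
from typing import Callable, List


def ends_with_no_htl(asset_name: str) -> bool:
    """Predicate for the No HTL comparison (assumed suffix check)."""
    return asset_name.endswith("_no_htl")


def ends_with_random(asset_name: str) -> bool:
    """Predicate for the Random comparison (assumed suffix check)."""
    return asset_name.endswith("_random")


def select(names: List[str], condition: Callable[[str], bool]) -> List[str]:
    """Keep only the names for which the predicate returns True."""
    return [name for name in names if condition(name)]


# Illustrative asset names derived from the example used earlier on this page.
names = [
    "AutoFilter_Chen_Like",
    "AutoFilter_Chen_Like_no_htl",
    "AutoFilter_Chen_Like_random",
]
print(select(names, ends_with_no_htl))
print(select(names, ends_with_random))
```

Passing the predicate as an argument keeps the selection logic in one place, so switching between the two comparisons only requires swapping the function that is handed in.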

significance_test.helper_function(significance_test_data: Dict[str, DataFrame], filter_condition: Callable[[str], bool]) -> Dict[str, Dict[str, ndarray]]

Process and filter significance test data based on the given condition.

Args:

significance_test_data (Dict[str, pd.DataFrame]): The significance test data.

filter_condition (Callable[[str], bool]): A function to filter asset names.

Returns:

Dict[str, Dict[str, np.ndarray]]: A dictionary mapping task names to filtered asset data.

significance_test.load_asset_data(workspace_data: DefaultDict[str, List[Path]]) -> Dict[str, DataFrame]

Load asset data from the given workspace directory paths.

Args:

workspace_data (DefaultDict[str, List[Path]]): A dictionary mapping task names to lists of asset directory paths.

Returns:

Dict[str, pd.DataFrame]: A dictionary mapping task names to concatenated DataFrames containing asset data.

significance_test.main()

significance_test.no_htl_vs_htl()

significance_test.prepare_significance_test_data(filter_condition: Callable[[str], bool]) -> Dict[str, DataFrame]

Prepare data for significance testing.

Args:

filter_condition (Callable[[str], bool]): The filtering condition function.

Returns:

Dict[str, pd.DataFrame]: Processed data ready for significance testing.

significance_test.random_vs_htl()

significance_test.signifance_test(comparison_data: Dict[str, DataFrame], file_name: str)

Perform significance testing and save results.

Args:

comparison_data (Dict[str, pd.DataFrame]): Data for the comparison.

file_name (str): The output file name for results.

significance_test.visualize_results()