Utility Functions


Processing Data

Filtering Input Data:

We need to filter the input data to fit our testing needs. `_filter_dataframe` does this: it takes a pandas DataFrame and a set of filters, and returns the filtered DataFrame.

source

_filter_dataframe

 _filter_dataframe (df, filters)

Filter a DataFrame using a dictionary or a list of dictionaries with multiple filter conditions.

Filter examples: pass a single value, such as {"State": "Wisconsin"}, or a list of accepted values, such as {"Cities": ["La Crosse", "Madison", "Eau Claire", "Milwaukee"]}.

	Type	Details
df		A pandas DataFrame
filters		dictionary or list of dictionaries
Returns	DataFrame
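
The filter semantics described above can be sketched as follows. This is a minimal illustration, not the module's actual code: `filter_dataframe_sketch` is a hypothetical stand-in, and it assumes that a list value means "match any of these values" and that every condition across all dictionaries must hold at once. The real `_filter_dataframe` may combine conditions differently.

```python
import pandas as pd


def filter_dataframe_sketch(df, filters):
    """Hypothetical stand-in for _filter_dataframe (assumed semantics)."""
    # Accept either a single dictionary or a list of dictionaries.
    if isinstance(filters, dict):
        filters = [filters]

    # Start with every row selected, then narrow the selection.
    mask = pd.Series(True, index=df.index)
    for condition in filters:
        for column, wanted in condition.items():
            # Assumption: a list means "any of these values";
            # a single value means plain equality.
            if isinstance(wanted, (list, tuple, set)):
                values = list(wanted)
            else:
                values = [wanted]
            mask &= df[column].isin(values)
    return df[mask]


# Toy data, invented for illustration only.
example = pd.DataFrame(
    {
        "State": ["Wisconsin", "Wisconsin", "Minnesota"],
        "Cities": ["Madison", "Green Bay", "Duluth"],
        "y": [1, 2, 3],
    }
)

single = filter_dataframe_sketch(example, {"State": "Wisconsin"})
multi = filter_dataframe_sketch(
    example,
    [{"State": "Wisconsin"}, {"Cities": ["Madison", "Milwaukee"]}],
)
```

Here `single` keeps both Wisconsin rows, while `multi` keeps only the row that satisfies both dictionaries.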

Removing Dimensions with few Observations:

Check Names and Data Types

source

_name_type_check

 _name_type_check (df, dimension, date_col)

Check datatypes and names of columns

Process Metric Column:

source

_process_metric_col

 _process_metric_col (df, metric_col)

Putting Everything together: `_process_data`

source

_process_data

 _process_data (path:str, dimension:str=None, date_col:str='ds',
                metric_col:Union[str,Callable]='y',
                filters:list[dict]=None, sz_threshold=50)

Filters and aggregates data

	Type	Default	Details
path	str		Path to Feather File
dimension	str	None	Independent Variable
date_col	str	ds	Date Column
metric_col	typing.Union[str, typing.Callable]	y	Dependent Variable
filters	list	None	Desired Filters
sz_threshold	int	50	Minimum number of observations
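
To illustrate what `sz_threshold` is for, here is a minimal sketch of dropping dimension values that have too few observations. The helper name `drop_small_groups_sketch` and the toy data are hypothetical, and the sketch assumes that a dimension value is kept when it has at least `sz_threshold` rows. How `_process_data` actually applies the threshold is not shown in this page.

```python
import pandas as pd


def drop_small_groups_sketch(df, dimension, sz_threshold=50):
    """Hypothetical helper: keep only dimension values with enough rows."""
    # Count how many rows each dimension value has, aligned to each row.
    counts = df[dimension].map(df[dimension].value_counts())
    # Assumption: "minimum number of observations" is inclusive.
    return df[counts >= sz_threshold]


# Toy data, invented for illustration only.
toy = pd.DataFrame(
    {
        "region": ["a", "a", "a", "b"],
        "ds": pd.date_range("2024-01-01", periods=4, freq="D"),
        "y": [1, 2, 3, 4],
    }
)

kept = drop_small_groups_sketch(toy, "region", sz_threshold=2)
```

With a threshold of 2, region "a" (three rows) is kept and region "b" (one row) is dropped.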
