Utility Functions
Processing Data
Filtering Input Data:
We need to filter the input data to fit our testing needs. _filter_dataframe
does this: it takes a pandas DataFrame and a set of filters and returns the matching rows.
_filter_dataframe
_filter_dataframe (df, filters)
Filter a DataFrame using a dictionary or a list of dictionaries with multiple filter conditions.
Filter Examples: You can pass a single value, such as {"State": "Wisconsin"}, or a list of values, such as {"Cities": ["La Crosse", "Madison", "Eau Claire", "Milwaukee"]}.
Parameter | Type | Details
---|---|---
df | | A pandas DataFrame
filters | | dictionary or list of dictionaries
Returns | DataFrame |
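To make the filter format concrete, here is a minimal sketch of how such a filter could be applied. The function name filter_dataframe_sketch is hypothetical and this is not the library's implementation. It assumes that every condition must hold at once (across keys and across dictionaries in a list), which the documentation above does not confirm.

```python
import pandas as pd

def filter_dataframe_sketch(df, filters):
    """Hypothetical stand-in for _filter_dataframe (assumed semantics)."""
    # Accept either one dictionary or a list of dictionaries.
    if isinstance(filters, dict):
        filters = [filters]
    # Start with every row selected, then narrow the selection per condition.
    mask = pd.Series(True, index=df.index)
    for condition in filters:
        for column, wanted in condition.items():
            # Wrap a single value in a list so isin() handles both cases.
            if isinstance(wanted, (list, tuple, set)):
                values = list(wanted)
            else:
                values = [wanted]
            # Assumption: all conditions are combined with AND.
            mask &= df[column].isin(values)
    return df[mask]

example = pd.DataFrame({
    "State": ["Wisconsin", "Wisconsin", "Minnesota"],
    "Cities": ["Madison", "Green Bay", "Duluth"],
    "y": [1, 2, 3],
})
single = filter_dataframe_sketch(example, {"State": "Wisconsin"})
multi = filter_dataframe_sketch(
    example, {"Cities": ["La Crosse", "Madison", "Eau Claire", "Milwaukee"]}
)
```

Here single keeps the two Wisconsin rows, and multi keeps only the Madison row because it is the only city in the list that appears in the example data.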
Removing Dimensions with few Observations:
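One way this step could look is sketched below. The helper drop_small_dimensions is hypothetical, and the sketch assumes that the sz_threshold parameter of _process_data is the cutoff and that a dimension value is kept when it has at least that many rows.

```python
import pandas as pd

def drop_small_dimensions(df, dimension, sz_threshold=50):
    """Hypothetical helper: keep only dimension values with enough rows."""
    # Count the rows for each dimension value, then map the count onto each row.
    counts = df[dimension].map(df[dimension].value_counts())
    # Assumption: "minimum number of observations" means count >= threshold.
    return df[counts >= sz_threshold]

example = pd.DataFrame({
    "State": ["Wisconsin"] * 3 + ["Iowa"],
    "y": [10, 20, 30, 40],
})
kept = drop_small_dimensions(example, "State", sz_threshold=2)
```

With a threshold of 2, the single Iowa row is dropped and the three Wisconsin rows remain.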
Check Names and Data Types
_name_type_check
_name_type_check (df, dimension, date_col)
Check datatypes and names of columns
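The exact checks are not listed here, so the following is only a sketch of what a name and type check might involve. The function name_type_check_sketch is hypothetical. It assumes that the dimension and date columns must exist and that the date column should end up as a datetime type.

```python
import pandas as pd

def name_type_check_sketch(df, dimension, date_col):
    """Hypothetical stand-in for _name_type_check; the real checks may differ."""
    # Name check: every requested column must exist (dimension may be None).
    required = [col for col in (dimension, date_col) if col is not None]
    missing = [col for col in required if col not in df.columns]
    if missing:
        raise KeyError(f"Missing columns: {missing}")
    # Type check (assumption): convert the date column if it is not datetime yet.
    out = df.copy()
    if not pd.api.types.is_datetime64_any_dtype(out[date_col]):
        out[date_col] = pd.to_datetime(out[date_col])
    return out

example = pd.DataFrame({"State": ["Wisconsin"], "ds": ["1949-01-31"], "y": [112]})
checked = name_type_check_sketch(example, "State", "ds")
```

Working on a copy leaves the caller's DataFrame untouched, which makes the check safe to run before any filtering.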
Process Metric Column:
_process_metric_col
_process_metric_col (df, metric_col)
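Since _process_data types metric_col as either a string or a callable, a sketch of how both cases might be handled is shown here. The function process_metric_col_sketch is hypothetical. It assumes that a string names an existing column, that a callable receives the DataFrame and returns one value per row, and that the result is stored under the name y.

```python
import pandas as pd

def process_metric_col_sketch(df, metric_col):
    """Hypothetical stand-in for _process_metric_col (assumed behavior)."""
    out = df.copy()
    if callable(metric_col):
        # Assumption: the callable computes the metric from the whole DataFrame.
        out["y"] = metric_col(out)
    else:
        # Assumption: a string names an existing column, exposed as "y".
        out = out.rename(columns={metric_col: "y"})
    return out

example = pd.DataFrame({"revenue": [10.0, 20.0], "orders": [2, 4]})
by_name = process_metric_col_sketch(example, "revenue")
by_func = process_metric_col_sketch(example, lambda d: d["revenue"] / d["orders"])
```

The callable form lets a derived metric, such as revenue per order, be computed without adding a column to the source data first.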
Putting Everything Together: _process_data
_process_data
_process_data (path:str, dimension:str=None, date_col:str='ds', metric_col:Union[str,Callable]='y', filters:list[dict]=None, sz_threshold=50)
Filters and aggregates data
Parameter | Type | Default | Details
---|---|---|---
path | str | | Path to Feather File
dimension | str | None | Independent Variable
date_col | str | ds | Date Column
metric_col | typing.Union[str, typing.Callable] | y | Dependent Variable
filters | list | None | Desired Filters
sz_threshold | int | 50 | Minimum number of observations