processor

Utilities for post-processing LLM extraction results: building audit records and summary statistics for a batch, formatting results for database insertion, and running an end-to-end parallel processing pipeline over query results.

source

create_audit_record

 create_audit_record (batch_id:str, source_id:str, total_questions:int,
                      processing_time_seconds:float,
                      llm_config:Dict[str,Any],
                      summary_stats:Dict[str,Any])

*Create an audit record for the extraction batch.*

Args:

- `batch_id`: Unique identifier for the batch
- `source_id`: Identifier for the source document
- `total_questions`: Number of questions processed
- `processing_time_seconds`: Time taken for processing
- `llm_config`: Configuration used for LLM calls
- `summary_stats`: Summary statistics from `create_summary_stats`

Returns: Audit record dictionary
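The exact fields of the audit record are not documented here; the following is a minimal sketch of what such a builder might look like. The function name, field names, and the `created_at` timestamp are illustrative assumptions, not the actual implementation:

```python
from datetime import datetime, timezone
from typing import Any, Dict

def create_audit_record_sketch(batch_id: str, source_id: str, total_questions: int,
                               processing_time_seconds: float,
                               llm_config: Dict[str, Any],
                               summary_stats: Dict[str, Any]) -> Dict[str, Any]:
    """Illustrative stand-in for create_audit_record; all field names are assumed."""
    return {
        "batch_id": batch_id,
        "source_id": source_id,
        "total_questions": total_questions,
        "processing_time_seconds": processing_time_seconds,
        "llm_config": llm_config,
        "summary_stats": summary_stats,
        # An audit record typically carries a creation timestamp (assumed field).
        "created_at": datetime.now(timezone.utc).isoformat(),
    }

record = create_audit_record_sketch("batch-001", "doc-42", 10, 3.5,
                                    {"model": "gpt-4o"}, {"success_rate": 0.9})
```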


source

create_summary_stats

 create_summary_stats (results:List[llm_data_extractor.models.ExtractionResult])

*Create summary statistics for a batch of extraction results.*

Args:

- `results`: List of `ExtractionResult` objects

Returns: Dictionary with summary statistics
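The keys of the returned statistics dictionary are not specified above. A minimal sketch, assuming a success/failure split is the core of the summary (the simplified `ExtractionResult` stand-in and all output keys are assumptions):

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional

@dataclass
class ExtractionResult:  # simplified stand-in for llm_data_extractor.models.ExtractionResult
    question_id: str
    value: Optional[Any]
    error: Optional[str] = None

def create_summary_stats_sketch(results: List[ExtractionResult]) -> Dict[str, Any]:
    """Illustrative stand-in; the real function's output keys may differ."""
    succeeded = [r for r in results if r.error is None]
    return {
        "total": len(results),
        "succeeded": len(succeeded),
        "failed": len(results) - len(succeeded),
        "success_rate": len(succeeded) / len(results) if results else 0.0,
    }

stats = create_summary_stats_sketch([
    ExtractionResult("q1", "42"),
    ExtractionResult("q2", None, error="timeout"),
])
```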


source

format_for_target_tables

 format_for_target_tables (results:List[llm_data_extractor.models.ExtractionResult], questions_map:Dict[str,Any])

*Format results grouped by target table for direct insertion into business tables.*

Args:

- `results`: List of `ExtractionResult` objects
- `questions_map`: Map of `question_id` to `Question` objects

Returns: Dictionary with table names as keys and records as values
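The grouping logic can be sketched as follows. The `Question` and `ExtractionResult` stand-ins, including the `target_table` and `target_column` attributes, are assumptions about the models, not the library's actual definitions:

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Any, Dict, List

@dataclass
class Question:  # simplified stand-in for llm_data_extractor.models.Question (attrs assumed)
    question_id: str
    target_table: str
    target_column: str

@dataclass
class ExtractionResult:  # simplified stand-in for llm_data_extractor.models.ExtractionResult
    question_id: str
    value: Any

def format_for_target_tables_sketch(results: List[ExtractionResult],
                                    questions_map: Dict[str, Question]
                                    ) -> Dict[str, List[Dict[str, Any]]]:
    """Group each extracted value under its question's target table."""
    tables: Dict[str, List[Dict[str, Any]]] = defaultdict(list)
    for r in results:
        q = questions_map[r.question_id]
        tables[q.target_table].append({q.target_column: r.value})
    return dict(tables)

grouped = format_for_target_tables_sketch(
    [ExtractionResult("q1", "Acme"), ExtractionResult("q2", 1999)],
    {"q1": Question("q1", "companies", "name"),
     "q2": Question("q2", "companies", "founded_year")},
)
```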


source

format_for_db

 format_for_db (results:List[llm_data_extractor.models.ExtractionResult],
                source_id:Optional[str]=None, batch_id:Optional[str]=None)

*Format extraction results for database insertion.*

Args:

- `results`: List of `ExtractionResult` objects
- `source_id`: Optional identifier for the source document/text
- `batch_id`: Optional identifier for the processing batch

Returns: List of dictionaries ready for database insertion
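A plausible shape for this flattening step, sketched under the assumption that each result becomes one row and the optional identifiers are attached when provided (row keys and the simplified model are assumptions):

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional

@dataclass
class ExtractionResult:  # simplified stand-in for llm_data_extractor.models.ExtractionResult
    question_id: str
    value: Any

def format_for_db_sketch(results: List[ExtractionResult],
                         source_id: Optional[str] = None,
                         batch_id: Optional[str] = None) -> List[Dict[str, Any]]:
    """One dict per result; source_id/batch_id columns only when supplied."""
    rows: List[Dict[str, Any]] = []
    for r in results:
        row: Dict[str, Any] = {"question_id": r.question_id, "value": r.value}
        if source_id is not None:
            row["source_id"] = source_id
        if batch_id is not None:
            row["batch_id"] = batch_id
        rows.append(row)
    return rows

rows = format_for_db_sketch([ExtractionResult("q1", "42")], source_id="doc-7")
```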


source

process_query

 process_query (query:str, llm_config:llm_data_extractor.models.LLMConfig,
                db_config:llm_data_extractor.models.DBConfig,
                results_table_name:str, max_workers:int=4)

*Fetches data from Snowflake, processes it in parallel using an LLM, and inserts results back.*

Args:

- `query`: SQL query to fetch the data to be processed
- `llm_config`: Configuration for the LLM
- `db_config`: Configuration for the database connection
- `results_table_name`: Name of the table to store the results
- `max_workers`: The maximum number of threads to use for parallel processing

Returns: A dictionary containing summary statistics of the processing run.
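The fetch → parallel-process → insert pipeline can be sketched with `concurrent.futures`. This is a skeleton only: the real function talks to Snowflake and an LLM, whereas here `extract` and `insert` are injected stand-ins and the summary key is assumed:

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Dict, List

def process_query_sketch(rows: List[Dict[str, Any]],
                         extract: Callable[[Dict[str, Any]], Dict[str, Any]],
                         insert: Callable[[List[Dict[str, Any]]], None],
                         max_workers: int = 4) -> Dict[str, Any]:
    """Skeleton of process_query: rows stand in for the Snowflake query result,
    extract for the per-row LLM call, insert for the write-back step."""
    # Process rows in parallel; pool.map preserves input order.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(extract, rows))
    insert(results)
    return {"rows_processed": len(results)}  # assumed summary shape

inserted: List[Dict[str, Any]] = []
summary = process_query_sketch(
    rows=[{"id": 1, "text": "hello"}, {"id": 2, "text": "world"}],
    extract=lambda row: {"id": row["id"], "length": len(row["text"])},  # LLM stand-in
    insert=inserted.extend,
    max_workers=2,
)
```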