API Reference
The 4 key modules
DataPreparer
QualityController
GaugeVsGriddedCorrelator
CEHGEARSubDailyProducer
- rainfall_gridder.DataPreparer(rainfall_data, rainfall_metadata, station_id_col, station_name_col, precipitation_col, date_time_col, start_date_col, end_date_col, easting_col, northing_col, gridded_rainfall_data, gridded_rainfall_col, rainfall_offset_hours, output_dir, min_n_timesteps, verbose=False)
Main data preparing algorithm.
- Parameters:
rainfall_data (DataFrame)
rainfall_metadata (DataFrame)
station_id_col (str)
station_name_col (str)
precipitation_col (str)
date_time_col (str)
start_date_col (str)
end_date_col (str)
easting_col (str)
northing_col (str)
gridded_rainfall_data (Dataset)
gridded_rainfall_col (str)
rainfall_offset_hours (int)
output_dir (str | Path)
min_n_timesteps (int)
verbose (bool)
- rainfall_gridder.QualityController(rainfall_data, rainfall_metadata, station_id_col, station_name_col, date_time_col, precipitation_col, easting_col, northing_col, start_date_col, end_date_col, input_crs, min_n_timesteps, output_dir, time_res, smallest_rainfall_amount, min_n_neighbours, qc_framework, nearby_rainfall_data_loader_kwargs={}, verbose=False)
Main quality control running algorithm.
- Parameters:
rainfall_data (DataFrame)
rainfall_metadata (DataFrame)
station_id_col (str)
station_name_col (str)
date_time_col (str)
precipitation_col (str)
easting_col (str)
northing_col (str)
start_date_col (str)
end_date_col (str)
input_crs (str)
min_n_timesteps (int)
output_dir (str | Path)
time_res (str)
smallest_rainfall_amount (int | float)
min_n_neighbours (int)
qc_framework (str)
nearby_rainfall_data_loader_kwargs (dict)
verbose (bool)
- rainfall_gridder.GaugeVsGriddedCorrelator(gauge_data, gauge_metadata, nearest_gridded_daily, station_id, precipitation_col, gridded_rainfall_col, date_time_col, start_date_col, end_date_col, station_id_col, easting_col, northing_col, rainfall_offset_hours, aggregate_gauge_to_daily=True)
- Parameters:
gauge_data (DataFrame)
gauge_metadata (DataFrame)
nearest_gridded_daily (Dataset)
station_id (str)
precipitation_col (str)
gridded_rainfall_col (str)
date_time_col (str)
start_date_col (str)
end_date_col (str)
station_id_col (str)
easting_col (str)
northing_col (str)
rainfall_offset_hours (int)
aggregate_gauge_to_daily (bool)
- rainfall_gridder.CEHGEARSubDailyProducer(rainfall_data, rainfall_metadata, station_id_col, time_step, time_res, precipitation_col, easting_col, northing_col, date_time_col, hour_at_start_of_day, verbose)
- Parameters:
rainfall_data (DataFrame)
rainfall_metadata (DataFrame)
station_id_col (str)
time_step (datetime)
time_res (str)
precipitation_col (str)
easting_col (str)
northing_col (str)
date_time_col (str)
hour_at_start_of_day (int)
verbose (bool)
Full API
Top-level package for RainfallGridder.
- class rainfall_gridder.CEHGEARSubDailyProducer(rainfall_data, rainfall_metadata, station_id_col, time_step, time_res, precipitation_col, easting_col, northing_col, date_time_col, hour_at_start_of_day, verbose)
Bases: object
Methods
calculate_distance_grid
get_cells_to_stat_disag
get_subdaily_rainfall_factors
produce_ceh_gear
run_interpolation
- Parameters:
rainfall_data (DataFrame)
rainfall_metadata (DataFrame)
station_id_col (str)
time_step (datetime)
time_res (str)
precipitation_col (str)
easting_col (str)
northing_col (str)
date_time_col (str)
hour_at_start_of_day (int)
verbose (bool)
- calculate_distance_grid(land_mask, gauge_x_grid=None, gauge_y_grid=None)
- Return type:
DataArray
- Parameters:
land_mask (DataArray)
gauge_x_grid (DataArray)
gauge_y_grid (DataArray)
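As a rough illustration of what a distance grid can look like, the sketch below computes, for every land cell, the straight-line distance to the nearest gauge. It is an assumption about the idea behind calculate_distance_grid, not the library's implementation: it uses plain nested lists rather than DataArray objects, and the names distance_to_nearest_gauge, gauge_cells and cell_size_m are hypothetical.

```python
import math


def distance_to_nearest_gauge(land_mask, gauge_cells, cell_size_m=1000.0):
    """Return a grid of distances (metres) from each land cell to its nearest gauge.

    land_mask   -- 2D list of booleans, True where the cell is land
    gauge_cells -- list of (row, col) grid positions that contain a gauge
    Sea cells, and every cell when there are no gauges, are set to NaN.
    """
    distances = []
    for row_idx, mask_row in enumerate(land_mask):
        out_row = []
        for col_idx, is_land in enumerate(mask_row):
            if not is_land or not gauge_cells:
                out_row.append(math.nan)
                continue
            # Distance in cell units to the closest gauge, then scaled to metres.
            nearest = min(
                math.hypot(row_idx - g_row, col_idx - g_col)
                for g_row, g_col in gauge_cells
            )
            out_row.append(nearest * cell_size_m)
        distances.append(out_row)
    return distances


# Hypothetical 2 x 3 grid with one sea cell and a single gauge in the top-left cell.
land_mask = [
    [True, True, False],
    [True, True, True],
]
grid = distance_to_nearest_gauge(land_mask, gauge_cells=[(0, 0)])
```

A brute-force search over all gauges is fine for a small grid; a spatial index would be the usual choice at national scale.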
- get_cells_to_stat_disag(land_mask, daily_totals_grid, distance_grid=None, max_distance_to_gauge_m=50000)
- Return type:
DataArray
- Parameters:
land_mask (DataArray)
daily_totals_grid (DataArray)
distance_grid (DataArray)
max_distance_to_gauge_m (int)
- get_subdaily_rainfall_factors(land_mask, daily_totals_grid, one_day_gridded_daily, cells_to_stat_disag, gridded_rainfall_col)
- Return type:
Dataset
- Parameters:
land_mask (DataArray)
daily_totals_grid (DataArray)
one_day_gridded_daily (Dataset)
cells_to_stat_disag (DataArray)
gridded_rainfall_col (str)
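One plausible reading of "sub-daily rainfall factors" is the fraction of a day's total that falls in each sub-daily time step, which can then be used to split a gridded daily value. The sketch below shows that arithmetic for a single cell. It is an assumption about the concept, not the library's code, and the function names subdaily_factors and disaggregate_daily are hypothetical.

```python
def subdaily_factors(subdaily_amounts):
    """Fraction of the daily total in each sub-daily time step.

    Returns all zeros for a dry day so that no division by zero occurs.
    """
    total = sum(subdaily_amounts)
    if total == 0:
        return [0.0] * len(subdaily_amounts)
    return [amount / total for amount in subdaily_amounts]


def disaggregate_daily(daily_value, factors):
    """Spread one gridded daily value over the sub-daily steps."""
    return [daily_value * factor for factor in factors]


# Four 6-hour gauge totals (mm) for one day, and a gridded daily value (mm).
gauge_steps = [0.0, 2.0, 6.0, 2.0]
factors = subdaily_factors(gauge_steps)
cell_steps = disaggregate_daily(20.0, factors)
```

Because the factors sum to one on a wet day, the disaggregated steps add back up to the gridded daily value.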
- class rainfall_gridder.DataPreparer(rainfall_data, rainfall_metadata, station_id_col, station_name_col, precipitation_col, date_time_col, start_date_col, end_date_col, easting_col, northing_col, gridded_rainfall_data, gridded_rainfall_col, rainfall_offset_hours, output_dir, min_n_timesteps, verbose=False)
Bases: object
Main data preparing algorithm.
Methods
run(save_data, return_data[, ...]): Run the data preparer and return and/or save the prepared data.
save_prepared_data([partition_by_columns]): Save data that has been prepared for gridding.
prepare_data_and_metadata_for_gridding
save_prepared_metadata
- Parameters:
rainfall_data (DataFrame)
rainfall_metadata (DataFrame)
station_id_col (str)
station_name_col (str)
precipitation_col (str)
date_time_col (str)
start_date_col (str)
end_date_col (str)
easting_col (str)
northing_col (str)
gridded_rainfall_data (Dataset)
gridded_rainfall_col (str)
rainfall_offset_hours (int)
output_dir (str | Path)
min_n_timesteps (int)
verbose (bool)
- classmethod run(save_data, return_data, partition_by_columns=None, **kwargs)
Run the data preparer and return and/or save the prepared data.
- Parameters:
save_data (bool) – Whether to save data to output directory
return_data (bool) – Whether to return dataframes
partition_by_columns (list) – List of columns to partition the parquet files by if saving outputs
- Return type:
None | tuple[DataFrame, DataFrame]
- Returns:
prepared_data: Data run through algorithm
prepared_metadata: Metadata of data run through algorithm
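The signature suggests that run is a classmethod which receives the constructor arguments through **kwargs and returns either None or a (data, metadata) pair, depending on return_data. The sketch below mimics only that calling pattern with a hypothetical stand-in class, TinyPreparer. It does not import rainfall_gridder, and the forwarding of **kwargs to the constructor is an assumption, not something the reference states.

```python
class TinyPreparer:
    """Hypothetical stand-in that mirrors the documented run() calling pattern."""

    def __init__(self, rainfall_data, rainfall_metadata, verbose=False):
        self.rainfall_data = rainfall_data
        self.rainfall_metadata = rainfall_metadata
        self.verbose = verbose
        self.saved = False

    def save(self):
        # A real implementation would write files; here we only record the call.
        self.saved = True

    @classmethod
    def run(cls, save_data, return_data, **kwargs):
        # Assumption: keyword arguments are forwarded to the constructor.
        preparer = cls(**kwargs)
        if save_data:
            preparer.save()
        if return_data:
            return preparer.rainfall_data, preparer.rainfall_metadata
        return None


result = TinyPreparer.run(
    save_data=False,
    return_data=True,
    rainfall_data=[{"station": "A", "rain_mm": 1.2}],
    rainfall_metadata=[{"station": "A"}],
)
# The return value may be None, so check before unpacking.
if result is not None:
    prepared_data, prepared_metadata = result
```

Checking for None before unpacking keeps the caller correct whichever value of return_data is passed.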
- class rainfall_gridder.GaugeVsGriddedCorrelator(gauge_data, gauge_metadata, nearest_gridded_daily, station_id, precipitation_col, gridded_rainfall_col, date_time_col, start_date_col, end_date_col, station_id_col, easting_col, northing_col, rainfall_offset_hours, aggregate_gauge_to_daily=True)
Bases: object
Methods
get_corr
- Parameters:
gauge_data (DataFrame)
gauge_metadata (DataFrame)
nearest_gridded_daily (Dataset)
station_id (str)
precipitation_col (str)
gridded_rainfall_col (str)
date_time_col (str)
start_date_col (str)
end_date_col (str)
station_id_col (str)
easting_col (str)
northing_col (str)
rainfall_offset_hours (int)
aggregate_gauge_to_daily (bool)
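The parameters rainfall_offset_hours and aggregate_gauge_to_daily suggest that sub-daily gauge readings are summed into daily totals on a shifted day boundary before being compared with the gridded daily values. The sketch below illustrates that idea with the standard library only. The interpretation of the offset, the helper names aggregate_to_daily and pearson, and the sample values are all assumptions, not taken from the library.

```python
import math
from collections import defaultdict
from datetime import datetime, timedelta


def aggregate_to_daily(records, offset_hours=0):
    """Sum (timestamp, amount) records into daily totals.

    Subtracting offset_hours before taking the date means that, with
    offset_hours=9, a "day" runs from 09:00 to 09:00 the next morning.
    """
    totals = defaultdict(float)
    for timestamp, amount in records:
        day = (timestamp - timedelta(hours=offset_hours)).date()
        totals[day] += amount
    return dict(totals)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


start = datetime(2020, 1, 1, 15)
# Six 12-hourly gauge readings (mm) at 15:00 and 03:00.
amounts = [1.0, 2.0, 2.0, 4.0, 4.0, 8.0]
gauge = [(start + timedelta(hours=12 * i), mm) for i, mm in enumerate(amounts)]
# With a 9-hour offset each 03:00 reading counts towards the previous day.
gauge_daily = aggregate_to_daily(gauge, offset_hours=9)

# Hypothetical gridded daily values for the same three days.
gridded_daily = [3.5, 6.5, 12.5]
days = sorted(gauge_daily)
corr = pearson([gauge_daily[d] for d in days], gridded_daily)
```

With no offset the same readings would be spread over four calendar days instead of three, which is why the day boundary needs to match the gridded product before correlating.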
- class rainfall_gridder.QualityController(rainfall_data, rainfall_metadata, station_id_col, station_name_col, date_time_col, precipitation_col, easting_col, northing_col, start_date_col, end_date_col, input_crs, min_n_timesteps, output_dir, time_res, smallest_rainfall_amount, min_n_neighbours, qc_framework, nearby_rainfall_data_loader_kwargs={}, verbose=False)
Bases: object
Main quality control running algorithm.
Methods
run(save_data, return_data[, ...]): Run the quality controller and return and/or save the prepared data.
save_qcd_data([partition_by_columns]): Save data that has been quality controlled for gridding.
Update all the shared keyword arguments.
get_nearest_neighbour
quality_control_data
save_qc_rulebase_summary
save_qcd_metadata
save_summary_of_qc
- Parameters:
rainfall_data (DataFrame)
rainfall_metadata (DataFrame)
station_id_col (str)
station_name_col (str)
date_time_col (str)
precipitation_col (str)
easting_col (str)
northing_col (str)
start_date_col (str)
end_date_col (str)
input_crs (str)
min_n_timesteps (int)
output_dir (str | Path)
time_res (str)
smallest_rainfall_amount (int | float)
min_n_neighbours (int)
qc_framework (str)
nearby_rainfall_data_loader_kwargs (dict)
verbose (bool)
- classmethod run(save_data, return_data, partition_by_columns=None, **kwargs)
Run the quality controller and return and/or save the prepared data.
- Parameters:
save_data (bool) – Whether to save data to output directory
return_data (bool) – Whether to return dataframes
partition_by_columns (list) – List of columns to partition the parquet files by if saving outputs
- Return type:
None | tuple[DataFrame, DataFrame]
- Returns:
qc_data: Data run through algorithm
qc_metadata: Metadata of data run through algorithm
- save_qcd_data(partition_by_columns=None)
Save data that has been quality controlled for gridding.
- Parameters:
partition_by_columns (list) – Columns that decide the partitioning of the output parquet file structure (default is station_id_col)
- Return type:
None
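To give a feel for what partitioning by columns means for the output structure, the sketch below groups rows by the partition columns and builds one "column=value" directory per group, the Hive-style layout that partitioned Parquet writers commonly produce. Whether save_qcd_data writes exactly this layout is an assumption; the helper partition_paths, the root directory name and the sample rows are hypothetical.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def partition_paths(rows, partition_by_columns, root="output"):
    """Group rows by partition column values and build Hive-style directory paths.

    Each partition column contributes one "column=value" directory level.
    """
    groups = defaultdict(list)
    for row in rows:
        parts = [f"{col}={row[col]}" for col in partition_by_columns]
        groups[PurePosixPath(root, *parts)].append(row)
    return dict(groups)


rows = [
    {"station_id": "A1", "rain_mm": 0.2},
    {"station_id": "A1", "rain_mm": 1.4},
    {"station_id": "B7", "rain_mm": 0.0},
]
layout = partition_paths(rows, partition_by_columns=["station_id"])
for path in sorted(layout):
    print(path, len(layout[path]))
```

Partitioning by station keeps each gauge's records together on disk, so a single station can be read back without scanning the whole dataset.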
Update all the shared keyword arguments.
TODO: Check this updating in the loop properly.
- Return type:
None
- Parameters:
nearby_rainfall_data_loader (NearbyRainfallDataLoader)