weighted_sample_statistics package

Submodules

weighted_sample_statistics.core module

Definition of the WeightedSampleStatistics class to calculate weighted sample statistics

class weighted_sample_statistics.core.WeightedSampleStatistics(group_keys: Iterable, records_df_selection: DataFrame, weights_df: DataFrame, column_list: Iterable | None = None, var_type: str | None = None, scaling_factor_key: str | None = None, units_scaling_factor_key: str | None = None, all_records_df: DataFrame | None = None, var_weight_key: str | None = None, variance_df_selection: DataFrame | None = None, records_df_unfilled: DataFrame | None = None, add_inverse: bool = False, report_numbers: bool = False, negation_suffix: str | None = None, start: bool = False)

Bases: object

Calculate weighted sample statistics for summations

Parameters:

group_keys (iterable) – The variables to group by
records_df_selection (DataFrame) – The selected microdata records
weights_df (DataFrame) – The weights per unit
all_records_df (DataFrame) – All the microdata including non-response
column_list (iterable) – List of columns for which to calculate weighted sample statistics
scaling_factor_key (str) – Name of the weight variable
var_type (str) – Type of the data
add_inverse (bool) – Add the negated value as well for booleans
report_numbers (bool) – Report the sum instead of the average

records_sum

The summation of the weighted values

Type: grouped

number_samples_sqrt

The square root of the sample size n

Type: grouped

standard_error

The standard error of the mean estimate: std / n_sqrt

Type: grouped
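
The relation between number_samples_sqrt and standard_error can be illustrated with plain pandas. This is a minimal, unweighted sketch and not the class's own implementation: the column names and data are hypothetical, and the use of the sample standard deviation (ddof=1) is an assumption, since the documentation above only states std / n_sqrt.

```python
import pandas as pd

# Hypothetical microdata: one row per responding unit.
records = pd.DataFrame(
    {
        "size_class": ["small", "small", "small", "large", "large"],
        "turnover": [2.0, 4.0, 6.0, 10.0, 14.0],
    }
)

grouped = records.groupby("size_class")["turnover"]

# Square root of the sample size n in each group.
number_samples_sqrt = grouped.count() ** 0.5

# Assumption: sample standard deviation (ddof=1, the pandas default),
# divided by sqrt(n) to obtain the standard error of the mean.
standard_error = grouped.std() / number_samples_sqrt
```

Both results are Series indexed by the group key, which matches the "grouped" type listed for these attributes only in spirit; the actual attribute types may differ.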

calculate() → None

Perform all calculations required for weighted sample statistics.

This method orchestrates the sequence of calculations necessary to determine weighted means, proportions, and standard errors. It also calculates the response fraction if all records are provided.

Return type: None

calculate_proportions(): Calculate proportions

calculate_response_fraction(): Calculate response fraction

calculate_standard_errors(): Calculate standard errors

calculate_weighted_means()

Calculate summed weighted statistics for the selected columns.

This method calculates the weighted sums and means for the selected columns in the dataset. It normalizes weights, applies them to the records, and handles special cases such as empty selections and negation of values.

Return type: None
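
The normalize-then-sum idea described above can be sketched with plain pandas. The column names, data, and the choice to normalize weights within each group are assumptions made for illustration; the method's actual internals (including the handling of empty selections and negated values) are not reproduced here.

```python
import pandas as pd

# Hypothetical microdata with one weight per unit.
records = pd.DataFrame(
    {
        "size_class": ["small", "small", "large", "large"],
        "turnover": [10.0, 20.0, 100.0, 300.0],
        "weight": [1.0, 3.0, 2.0, 2.0],
    }
)

# Assumption: normalize the weights so that they sum to 1 within each group.
weight_sums = records.groupby("size_class")["weight"].transform("sum")
normalized_weight = records["weight"] / weight_sums

# Apply the normalized weights to the records.
records["weighted_turnover"] = records["turnover"] * normalized_weight

# Summing the weighted values per group yields the weighted mean per group.
weighted_means = records.groupby("size_class")["weighted_turnover"].sum()
```

With unnormalized weights the same summation would give a weighted total instead of a mean, which is roughly the distinction the report_numbers parameter describes.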

group_variables()

Group the variables according to the group keys.

This function groups the variables and the weights according to the specified group keys. The grouped variables are stored as attributes of the class for later use.

Return type: None

scale_variables()

Scale the variables by the scaling factor.

set_mask_valid_df()

Set the mask of valid records.

This function sets the mask for the valid records for the selected variables. This mask is used to select the valid records from the dataframe of the population.

Return type: None
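
A validity mask of this kind can be sketched as a boolean DataFrame. The example below assumes that "valid" means "not missing", which the documentation does not state; the column names and data are hypothetical.

```python
import pandas as pd

# Hypothetical microdata with missing values for some units.
records = pd.DataFrame(
    {
        "turnover": [10.0, float("nan"), 30.0],
        "employees": [5.0, 7.0, float("nan")],
    }
)
column_list = ["turnover", "employees"]

# Assumption: a record is valid for a column when its value is not missing.
mask_valid = records[column_list].notna()

# Use the mask to select only the valid records for one variable.
valid_turnover = records.loc[mask_valid["turnover"], "turnover"]
```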

weighted_sample_statistics.core.make_negation_name(column_name: str, suffix: str = '_x') → str

Make a new column name for complementary values.

Returns: negation_name
Return type: str
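
The documentation does not spell out how the new name is built. One plausible convention, assumed here purely for illustration, is to append the suffix to the original column name. The helper negation_name_sketch below is hypothetical and is not the library function.

```python
def negation_name_sketch(column_name: str, suffix: str = "_x") -> str:
    """Return a column name for the complementary (negated) values.

    Assumption: the suffix is simply appended to the original name.
    """
    return column_name + suffix


# Hypothetical boolean column and the name of its complement.
negation_name = negation_name_sketch("has_website")
```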

weighted_sample_statistics.main module

This is a skeleton file that can serve as a starting point for a Python console script.

Besides console scripts, the header (i.e., until _logger…) of this file can also be used as a template for Python modules.

Note

This file can be renamed depending on your needs or safely removed if not needed.

weighted_sample_statistics.main.main(args)

Wrapper function

Parameters: args (List[str]) – command line parameters as a list of strings (for example, ["--verbose", "42"]).

weighted_sample_statistics.main.parse_args(args)

Parse command line parameters

Parameters: args (List[str]) – command line parameters as a list of strings (for example, ["--help"]).
Returns: argparse.Namespace: command line parameters namespace
Return type: obj
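
Parsing a list of strings into an argparse.Namespace can be sketched as follows. The function name parse_args_sketch and the --verbose option with an integer value are hypothetical; the options actually accepted by this module are not listed in the documentation above.

```python
import argparse


def parse_args_sketch(args):
    """Parse a list of command line strings into a namespace."""
    parser = argparse.ArgumentParser(description="Hypothetical example parser")
    # Hypothetical option that takes an integer value.
    parser.add_argument("--verbose", type=int, default=0, help="verbosity level")
    return parser.parse_args(args)


# Passing the arguments as a list makes the parser easy to call from tests.
namespace = parse_args_sketch(["--verbose", "42"])
```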

weighted_sample_statistics.main.run()

Calls main(), passing the CLI arguments extracted from sys.argv.

This function can be used as an entry point to create console scripts with setuptools.

weighted_sample_statistics.main.setup_logging(loglevel)

Set up basic logging

Parameters: loglevel (int) – minimum loglevel for emitting messages
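
A basic logging setup that takes a minimum level can be sketched with the standard library. The function name setup_logging_sketch, the log format, and the use of force=True are assumptions for illustration, not necessarily what this module does.

```python
import logging
import sys


def setup_logging_sketch(loglevel):
    """Configure the root logger to emit messages at or above loglevel."""
    logging.basicConfig(
        level=loglevel,
        stream=sys.stdout,
        format="[%(asctime)s] %(levelname)s:%(name)s:%(message)s",
        datefmt="%Y-%m-%d %H:%M:%S",
        force=True,  # replace handlers left over from earlier configuration
    )


setup_logging_sketch(logging.INFO)
logger = logging.getLogger("example")
logger.debug("suppressed: below the minimum level")
logger.info("emitted: at the minimum level")
```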
