weighted_sample_statistics package

Submodules

weighted_sample_statistics.core module

Definition of the WeightedSampleStatistics class, which calculates weighted sample statistics

class weighted_sample_statistics.core.WeightedSampleStatistics(group_keys: Iterable, records_df_selection: DataFrame, weights_df: DataFrame, column_list: Iterable | None = None, var_type: str | None = None, scaling_factor_key: str | None = None, units_scaling_factor_key: str | None = None, all_records_df: DataFrame | None = None, var_weight_key: str | None = None, variance_df_selection: DataFrame | None = None, records_df_unfilled: DataFrame | None = None, add_inverse: bool = False, report_numbers: bool = False, negation_suffix: str | None = None, start: bool = False)[source]

Bases: object

Calculate weighted sample statistics for summations

Parameters:
  • group_keys (iterable) – The variables to use to group

  • records_df_selection (DataFrame) – All the microdata including non-response

  • weights_df (DataFrame) – The weights per unit

  • all_records_df (DataFrame) – All the microdata including non-response

  • column_list (iterable) – List of columns for which to calculate weighted sample statistics

  • scaling_factor_key (str) – Name of the weight variable

  • var_type (str) – Type of the data

  • add_inverse (bool) – Add the negated value as well for booleans

  • report_numbers (bool) – Calculate the sum instead of the average

records_sum

The summation of the weighted values

Type:

grouped

number_samples_sqrt

The square root of the sample size n

Type:

grouped

standard_error

The standard error of the mean estimate: std / n_sqrt

Type:

grouped
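
The relation std / n_sqrt can be illustrated per group with a minimal sketch. It uses plain Python instead of the package's grouped DataFrame objects, and it assumes an unweighted sample standard deviation; the function name standard_error_per_group and the input layout are hypothetical and not part of the package.

```python
import math
import statistics
from collections import defaultdict


def standard_error_per_group(records):
    """Return {group: std / sqrt(n)} for a list of (group, value) pairs.

    Hypothetical helper: it only illustrates the formula std / n_sqrt.
    """
    values_by_group = defaultdict(list)
    for group, value in records:
        values_by_group[group].append(value)

    result = {}
    for group, values in values_by_group.items():
        n_sqrt = math.sqrt(len(values))   # square root of the sample size n
        std = statistics.stdev(values)    # sample standard deviation, needs n >= 2
        result[group] = std / n_sqrt
    return result


records = [("a", 2.0), ("a", 4.0), ("a", 6.0), ("a", 8.0), ("b", 1.0), ("b", 3.0)]
errors = standard_error_per_group(records)
```

Groups with a single record have no defined sample standard deviation, so a real implementation would need to decide how to report them.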

calculate() → None[source]

Perform all calculations required for weighted sample statistics.

This method orchestrates the sequence of calculations necessary to determine weighted means, proportions, and standard errors. It also calculates the response fraction if all records are provided.

Return type:

None

calculate_proportions()[source]

Calculate proportions

calculate_response_fraction()[source]

Calculate response fraction

calculate_standard_errors()[source]

Calculate standard errors

calculate_weighted_means()[source]

Calculate summed weighted statistics for the selected columns.

This method calculates the weighted sums and means for the selected columns in the dataset. It normalizes weights, applies them to the records, and handles special cases such as empty selections and negation of values.

Return type:

None
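
The normalize-then-apply step described above can be sketched as follows. This is an illustration under stated assumptions, not the package's implementation: it uses plain Python tuples instead of DataFrames, assumes strictly positive weights, and the name weighted_sums_and_means is hypothetical.

```python
from collections import defaultdict


def weighted_sums_and_means(records):
    """Return {group: (weighted_sum, weighted_mean)}.

    `records` is a list of (group, value, weight) tuples. Weights are
    normalized within each group so that they sum to one before the
    mean is taken. Assumes every group has a positive total weight.
    """
    total_weight = defaultdict(float)
    for group, _value, weight in records:
        total_weight[group] += weight

    weighted_sum = defaultdict(float)
    weighted_mean = defaultdict(float)
    for group, value, weight in records:
        weighted_sum[group] += weight * value
        # Normalized weight: the share of this record within its group.
        weighted_mean[group] += (weight / total_weight[group]) * value

    return {g: (weighted_sum[g], weighted_mean[g]) for g in total_weight}


records = [("a", 10.0, 1.0), ("a", 20.0, 3.0), ("b", 5.0, 2.0)]
stats = weighted_sums_and_means(records)
# Group "a": weighted sum 70.0, weighted mean 17.5
```

An empty selection yields an empty dictionary here; how the package itself reports empty selections is not specified in this reference.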

group_variables()[source]

Group the variables according to the group keys.

This function groups the variables and the weights according to the specified group keys. The grouped variables are stored as attributes of the class for later use.

Return type:

None

scale_variables()[source]

Scale the variables by the scaling factor.

set_mask_valid_df()[source]

Set the mask of valid records.

This function sets the mask for the valid records for the selected variables. This mask is used to select the valid records from the dataframe of the population.

Return type:

None
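
A mask of this kind can be sketched in plain Python. The sketch assumes that "valid" means "not missing" (neither None nor NaN) in every selected variable; the package may use a different criterion. The names valid_mask and is_missing are hypothetical.

```python
import math


def is_missing(value):
    """Treat None and float NaN as missing (an assumption of this sketch)."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def valid_mask(rows, columns):
    """Return one boolean per row: True if all selected columns are present."""
    return [not any(is_missing(row.get(col)) for col in columns) for row in rows]


rows = [
    {"x": 1.0, "y": 2.0},
    {"x": None, "y": 3.0},
    {"x": 4.0, "y": float("nan")},
]
mask = valid_mask(rows, ["x", "y"])

# Apply the mask to keep only the valid records.
valid_rows = [row for row, keep in zip(rows, mask) if keep]
```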

weighted_sample_statistics.core.make_negation_name(column_name: str, suffix: str = '_x') → str[source]

Make a new column name for complementary values.

Returns:

negation_name

Return type:

str

weighted_sample_statistics.main module

This is a skeleton file that can serve as a starting point for a Python console script.

Besides console scripts, the header (i.e., until _logger…) of this file can also be used as a template for Python modules.

Note

This file can be renamed depending on your needs or safely removed if not needed.

weighted_sample_statistics.main.main(args)[source]

Wrapper function

Parameters:

args (List[str]) – command line parameters as a list of strings (for example, ["--verbose", "42"]).

weighted_sample_statistics.main.parse_args(args)[source]

Parse command line parameters

Parameters:

args (List[str]) – command line parameters as a list of strings (for example, ["--help"]).

Returns:

command line parameters namespace

Return type:

argparse.Namespace
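
A parser that accepts an argument list such as ["--verbose", "42"] could look roughly like the sketch below. The positional argument n, the loglevel destination, and the function name parse_args_sketch are assumptions for illustration; the options actually defined by this module are not listed in this reference.

```python
import argparse
import logging


def parse_args_sketch(args):
    """Parse a list of command line strings into an argparse.Namespace."""
    parser = argparse.ArgumentParser(description="Hypothetical demo parser")
    # Hypothetical positional integer argument.
    parser.add_argument("n", type=int, help="an integer value")
    # Hypothetical flag that raises the log level from WARNING to INFO.
    parser.add_argument(
        "-v",
        "--verbose",
        dest="loglevel",
        action="store_const",
        const=logging.INFO,
        default=logging.WARNING,
        help="set loglevel to INFO",
    )
    return parser.parse_args(args)


namespace = parse_args_sketch(["--verbose", "42"])
```

Passing the argument list explicitly, instead of reading sys.argv inside the function, keeps the parser easy to call from tests.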

weighted_sample_statistics.main.run()[source]

Calls main(), passing the CLI arguments extracted from sys.argv

This function can be used as an entry point to create console scripts with setuptools.

weighted_sample_statistics.main.setup_logging(loglevel)[source]

Set up basic logging

Parameters:

loglevel (int) – minimum loglevel for emitting messages
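
A basic logging setup of this kind is commonly built on logging.basicConfig. The sketch below is one possible form, not necessarily what this module does: the format string, the choice of stdout, and the name setup_logging_sketch are assumptions.

```python
import logging
import sys


def setup_logging_sketch(loglevel):
    """Configure the root logger to emit messages at `loglevel` or above."""
    logging.basicConfig(
        level=loglevel,
        stream=sys.stdout,
        format="[%(asctime)s] %(levelname)s:%(name)s:%(message)s",
        datefmt="%Y-%m-%d %H:%M:%S",
        force=True,  # replace existing root handlers (Python 3.8+)
    )


setup_logging_sketch(logging.DEBUG)
logging.getLogger(__name__).debug("logging configured")
```

Without force=True, basicConfig does nothing when the root logger already has handlers, so a repeated call would silently keep the earlier configuration.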

Module contents