random_variables ¶
Random Variable Implementations for Process Analysis.
This module provides a framework for working with random variables and probability distributions in a type-safe, numerically stable way. It includes implementations of common distributions (Normal, Uniform, Categorical) and infrastructure for managing collections of random variables.
The module uses a metaclass-based approach to ensure consistent handling of array dimensions across all distributions, making it easy to work with both scalar and vector-valued random variables.
Key Features
- Type-safe implementations of common probability distributions
- Automatic dimension handling via the squeezable decorator
- Support for reproducible sampling via seed management
- Collections for managing groups of random variables
- Serialization support via pydantic
CategoricalDistribution ¶
Bases: RandomVariable[T]
Categorical distribution for discrete outcomes with specified probabilities.
A categorical distribution (also called a discrete distribution) describes the probability of obtaining one of k possible outcomes. Each outcome has a probability between 0 and 1, and all probabilities must sum to 1.
If probabilities are not specified, defaults to equal probabilities for all categories (uniform discrete distribution).
Key Properties
- Support is finite set of categories
- PMF gives probability of each category
- CDF is step function
Attributes:
Name | Type | Description |
---|---|---|
categories | ndarray | Array of possible outcomes (any type) |
probabilities | ndarray | Probability for each category |
name | str | Identifier for this distribution instance |
seed | Optional[int] | Random seed for reproducible sampling |
replace | bool | Whether to sample with replacement; if True, the same value may be drawn more than once |
Raises:
Type | Description |
---|---|
ValueError | If probabilities don't sum to 1 |
ValueError | If lengths of categories and probabilities don't match |
ValueError | If any probability is negative |
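The three checks above, together with the equal-probability default, can be sketched as a standalone function. This is a minimal illustration rather than the module's implementation: the name `normalize_probabilities`, the order of the checks, and the tolerance of 1e-9 are all assumptions.

```python
import math
from typing import Optional, Sequence


def normalize_probabilities(
    categories: Sequence[object],
    probabilities: Optional[Sequence[float]] = None,
    tol: float = 1e-9,  # assumed tolerance; the module may use a different one
) -> list:
    """Hypothetical helper mirroring the documented validation rules."""
    if probabilities is None:
        # Default: equal probability for every category.
        return [1.0 / len(categories)] * len(categories)
    if len(probabilities) != len(categories):
        raise ValueError("categories and probabilities must have the same length")
    if any(p < 0 for p in probabilities):
        raise ValueError("probabilities must be non-negative")
    if not math.isclose(sum(probabilities), 1.0, abs_tol=tol):
        raise ValueError("probabilities must sum to 1")
    return list(probabilities)


# Four categories without explicit probabilities get 0.25 each.
print(normalize_probabilities(["a", "b", "c", "d"]))
```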
cdf ¶
Evaluate the cumulative distribution function.
For categorical distributions, this is a step function that increases at each category by that category's probability.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
x | ndarray | Points at which to evaluate the CDF | required |
Other Parameters:
Name | Type | Description |
---|---|---|
squeeze | bool | Whether to remove unnecessary dimensions. Added by RandomVariableMeta. Defaults to True. |
Returns:
Type | Description |
---|---|
NDArray[Any, float] | CDF values at the input points |
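As a rough illustration of the step-function behaviour, the cumulative values can be built as a running sum of the category probabilities. The sketch below assumes the CDF follows the order in which the categories are listed; the names `categories`, `probabilities`, and `cdf_by_category` are illustrative and not part of the module.

```python
from itertools import accumulate

# Illustrative data; the listing order defines the steps (an assumption).
categories = ["low", "medium", "high"]
probabilities = [0.25, 0.5, 0.25]

# Running total of the probabilities: the CDF rises by p_i at category i.
cumulative = list(accumulate(probabilities))
cdf_by_category = dict(zip(categories, cumulative))

print(cdf_by_category)
```

The last cumulative value is 1 whenever the probabilities sum to 1, which is why the constructor's validation matters for the CDF as well.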
Source code in src/data_handlers/random_variables.py
pdf ¶
Evaluate the probability mass function (PMF).
For categorical distributions, this gives the probability of each category occurring.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
x | ndarray | Points at which to evaluate the PMF | required |
Other Parameters:
Name | Type | Description |
---|---|---|
squeeze | bool | Whether to remove unnecessary dimensions. Added by RandomVariableMeta. Defaults to True. |
Returns:
Type | Description |
---|---|
NDArray[Any, float] | Probability of each input value |
Source code in src/data_handlers/random_variables.py
sample ¶
Generate random samples from the categorical distribution.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
size | int | Number of samples to generate. Defaults to 1. | 1 |
Other Parameters:
Name | Type | Description |
---|---|---|
squeeze | bool | Whether to remove unnecessary dimensions. Added by RandomVariableMeta. Defaults to True. |
Notes
The squeeze parameter is added automatically by the metaclass and does not appear in the function signature, but can be passed as a keyword argument.
Returns:
Type | Description |
---|---|
NDArray[Any, T] | Array of samples from the categories |
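Seeded categorical sampling can be approximated with NumPy's `Generator.choice`. This is a sketch under assumptions: the module's internal random number handling is not shown in this reference, and the helper `draw` and its arguments are hypothetical names that only mirror the documented `seed` and `replace` attributes.

```python
import numpy as np

categories = np.array(["red", "green", "blue"])
probabilities = np.array([0.5, 0.3, 0.2])


def draw(size: int, seed: int, replace: bool = True) -> np.ndarray:
    """Hypothetical helper: draw `size` categories with a fixed seed."""
    rng = np.random.default_rng(seed)  # a fresh generator per call keeps draws reproducible
    return rng.choice(categories, size=size, replace=replace, p=probabilities)


first = draw(5, seed=42)
second = draw(5, seed=42)
print(first, second)  # identical arrays, because the seed is the same

# Without replacement, each category can appear at most once.
unique_draw = draw(3, seed=0, replace=False)
```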
Source code in src/data_handlers/random_variables.py
validate_and_set_probabilities ¶
validate_and_set_probabilities() -> CategoricalDistribution
Validate probability values and set defaults if needed.
Source code in src/data_handlers/random_variables.py
NormalDistribution ¶
Bases: RandomVariable[float]
Normal (Gaussian) distribution with mean μ and standard deviation σ.
The normal distribution is a continuous probability distribution that is symmetric about its mean, showing the familiar bell-shaped curve.
Key Properties
- Symmetric about the mean
- ~68% of values lie within 1σ of μ
- ~95% lie within 2σ of μ
- ~99.7% lie within 3σ of μ
Attributes:
Name | Type | Description |
---|---|---|
mu | float | Mean (μ) of the distribution |
sigma | float | Standard deviation (σ) of the distribution |
name | str | Identifier for this distribution instance |
seed | Optional[int] | Random seed for reproducible sampling |
cdf ¶
Evaluate the normal cumulative distribution function.
The CDF is computed using the error function: F(x) = 1/2 * (1 + erf((x-μ)/(σ√2)))
Parameters:
Name | Type | Description | Default |
---|---|---|---|
x | ndarray | Points at which to evaluate the CDF | required |
Other Parameters:
Name | Type | Description |
---|---|---|
squeeze | bool | Whether to remove unnecessary dimensions. Added by RandomVariableMeta. Defaults to True. |
Returns:
Type | Description |
---|---|
NDArray[Any, float] | CDF values at the input points |
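The error-function formula can be checked with the standard library alone. The function `normal_cdf` below is an illustrative scalar version, not the module's vectorised method; it also reproduces the 68/95/99.7 figures listed under Key Properties.

```python
import math


def normal_cdf(x: float, mu: float = 0.0, sigma: float = 1.0) -> float:
    """F(x) = 1/2 * (1 + erf((x - mu) / (sigma * sqrt(2))))."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


# Probability mass within k standard deviations of the mean.
for k in (1, 2, 3):
    within = normal_cdf(k) - normal_cdf(-k)
    print(f"within {k} sigma: {within:.4f}")
```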
Source code in src/data_handlers/random_variables.py
pdf ¶
Evaluate the normal probability density function.
The PDF is given by: f(x) = 1/(σ√(2π)) * exp(-(x-μ)²/(2σ²))
Parameters:
Name | Type | Description | Default |
---|---|---|---|
x | ndarray | Points at which to evaluate the PDF | required |
Other Parameters:
Name | Type | Description |
---|---|---|
squeeze | bool | Whether to remove unnecessary dimensions. Added by RandomVariableMeta. Defaults to True. |
Returns:
Type | Description |
---|---|
NDArray[Any, float] | PDF values at the input points |
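A scalar sketch of the density formula, written with the standard library only. `normal_pdf` is an illustrative name, and the Riemann sum is just a sanity check that the density integrates to roughly 1.

```python
import math


def normal_pdf(x: float, mu: float = 0.0, sigma: float = 1.0) -> float:
    """f(x) = 1 / (sigma * sqrt(2*pi)) * exp(-(x - mu)^2 / (2 * sigma^2))."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


# Left Riemann sum over [-8, 8]; the tails beyond that are negligible.
step = 0.001
area = sum(normal_pdf(-8.0 + i * step) for i in range(16000)) * step
print(f"peak: {normal_pdf(0.0):.4f}, area: {area:.6f}")
```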
Source code in src/data_handlers/random_variables.py
sample ¶
Generate random samples from the normal distribution.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
size | int \| tuple[int, ...] | Number or shape of samples | 1 |
Other Parameters:
Name | Type | Description |
---|---|---|
squeeze | bool | Whether to remove unnecessary dimensions. Added by RandomVariableMeta. Defaults to True. |
Returns:
Type | Description |
---|---|
NDArray[Any, float] | Array of samples from N(μ, σ) |
Source code in src/data_handlers/random_variables.py
RandomVariable ¶
Bases: NamedObject, Generic[T]
Base class for random variables.
This class provides a common interface for random variable implementations. Subclasses must implement sample(), pdf(), and cdf() methods. The metaclass ensures these methods support dimension control via the squeeze parameter.
The class is generic over the type of values it produces (T), which must be a subtype of SerializableValue to ensure proper serialization behavior.
Attributes:
Name | Type | Description |
---|---|---|
_registry_category | str | Category name for object registration |
seed | Optional[int] | Random seed for reproducible sampling |
name | str | Identifier for this random variable instance |
Type Variables
T: The type of values produced by this random variable
cdf ¶
Evaluate cumulative distribution function at specified points.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
x | ndarray | Points at which to evaluate the CDF | required |
Other Parameters:
Name | Type | Description |
---|---|---|
squeeze | bool | Whether to remove unnecessary dimensions. Added by RandomVariableMeta. Defaults to True. |
Returns:
Type | Description |
---|---|
NDArray[Any, T] | CDF values at the input points |
Raises:
Type | Description |
---|---|
NotImplementedError | Must be implemented by subclasses |
Source code in src/data_handlers/random_variables.py
pdf ¶
Evaluate probability density function at specified points.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
x | ndarray | Points at which to evaluate the PDF | required |
Other Parameters:
Name | Type | Description |
---|---|---|
squeeze | bool | Whether to remove unnecessary dimensions. Added by RandomVariableMeta. Defaults to True. |
Returns:
Type | Description |
---|---|
NDArray[Any, T] | PDF values at the input points |
Raises:
Type | Description |
---|---|
NotImplementedError | Must be implemented by subclasses |
Source code in src/data_handlers/random_variables.py
register_to_hash ¶
register_to_hash(
var_hash: RandomVariableHash, size: int = 1, sample: bool = True, squeeze: bool = True
) -> NamedValue[T | NDArray[Any, T]]
Register this random variable to a hash and return sampled values.
This is a convenience method for adding a random variable to a collection and immediately sampling from it.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
var_hash | RandomVariableHash | Hash object to register to | required |
size | int | Number of samples to generate | 1 |
Returns:
Type | Description |
---|---|
NamedValue[T \| NDArray[Any, T]] | Named value containing samples |
Source code in src/data_handlers/random_variables.py
sample ¶
Generate random samples from the distribution.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
size | int | Number of samples to generate. Defaults to 1. | 1 |
Other Parameters:
Name | Type | Description |
---|---|---|
squeeze | bool | Whether to remove unnecessary dimensions. Added by RandomVariableMeta. Defaults to True. |
Returns:
Type | Description |
---|---|
NDArray[Any, T] | Array of samples from the distribution |
Raises:
Type | Description |
---|---|
NotImplementedError | Must be implemented by subclasses |
Source code in src/data_handlers/random_variables.py
RandomVariableHash ¶
Bases: NamedObjectHash
Collection of random variables.
This class manages a collection of RandomVariable objects, providing methods to register, retrieve and sample from multiple distributions.
Attributes:
Name | Type | Description |
---|---|---|
_registry_category | str | Category name for object registration |
get_variables ¶
get_variables() -> Iterable[RandomVariable]
Get all registered random variables.
Returns:
Type | Description |
---|---|
Iterable[RandomVariable] | Iterator over all registered variables |
register_variable ¶
register_variable(
var: RandomVariable, size: int = 1, sample: bool = True, squeeze: bool = True
) -> NamedValue[SerializableValue | NDArray[Any, SerializableValue]]
Register a random variable and return its samples wrapped in a NamedValue.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
var | RandomVariable | Random variable to register | required |
size | int | Number of samples to generate | 1 |
Returns:
Type | Description |
---|---|
NamedValue[SerializableValue \| NDArray[Any, SerializableValue]] | Named value containing samples |
Raises:
Type | Description |
---|---|
ValueError | If a variable with the same name already exists |
Source code in src/data_handlers/random_variables.py
sample_all ¶
Sample from all registered distributions.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
size | int | Number of samples to generate per distribution | 1 |
Returns:
Type | Description |
---|---|
dict[str, ndarray] | Dictionary mapping variable names to their samples |
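The register-then-sample pattern can be pictured with a small name-keyed collection. Everything in this sketch is hypothetical: `TinyVariableHash`, its `register` method, and the use of plain callables and lists stand in for RandomVariableHash, RandomVariable objects, and NumPy arrays, whose real implementations are not shown in this reference.

```python
import random
from typing import Callable, Dict, List


class TinyVariableHash:
    """Hypothetical stand-in for a name-keyed collection of samplers."""

    def __init__(self) -> None:
        self._samplers: Dict[str, Callable[[], float]] = {}

    def register(self, name: str, sampler: Callable[[], float]) -> None:
        # Mirror the documented behaviour: duplicate names are rejected.
        if name in self._samplers:
            raise ValueError(f"a variable named {name!r} already exists")
        self._samplers[name] = sampler

    def sample_all(self, size: int = 1) -> Dict[str, List[float]]:
        # One list of `size` draws per registered name.
        return {
            name: [sampler() for _ in range(size)]
            for name, sampler in self._samplers.items()
        }


rng = random.Random(0)  # fixed seed so repeated runs give the same draws
variables = TinyVariableHash()
variables.register("temperature", lambda: rng.gauss(20.0, 2.0))
variables.register("pressure", lambda: rng.uniform(1.0, 2.0))
samples = variables.sample_all(size=3)
print(samples)
```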
Source code in src/data_handlers/random_variables.py
RandomVariableList ¶
Bases: NamedObjectList
List of random variables.
This class manages an ordered list of RandomVariable objects, providing methods to add, access, and sample from multiple distributions while maintaining their order.
Attributes:
Name | Type | Description |
---|---|---|
_registry_category | str | Category name for object registration |
objects | List[SerializeAsAny[InstanceOf[RandomVariable]]] | List of random variable objects |
__getitem__ ¶
__getitem__(idx: int) -> RandomVariable
Get a random variable by its index in the list.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
idx | int | Index of the random variable to retrieve | required |
Returns:
Name | Type | Description |
---|---|---|
RandomVariable | RandomVariable | The random variable at the specified index |
Raises:
Type | Description |
---|---|
IndexError | If the index is out of range |
Source code in src/data_handlers/random_variables.py
append ¶
append(variable: RandomVariable) -> Self
Append a random variable to the end of the list.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
variable | RandomVariable | Random variable to append | required |
Returns:
Name | Type | Description |
---|---|---|
Self | Self | The RandomVariableList instance for method chaining |
Source code in src/data_handlers/random_variables.py
extend ¶
extend(variables: Iterable[RandomVariable]) -> Self
Extend the list with multiple random variables.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
variables | Iterable[RandomVariable] | Collection of random variables to add | required |
Returns:
Name | Type | Description |
---|---|---|
Self | Self | The RandomVariableList instance for method chaining |
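Returning the list itself from append and extend is what allows calls to be chained. The sketch below shows the pattern with a hypothetical `ChainableList` that stores plain strings; it is not the RandomVariableList implementation.

```python
from typing import Iterable, List


class ChainableList:
    """Hypothetical list wrapper whose mutators return self."""

    def __init__(self) -> None:
        self.objects: List[str] = []

    def append(self, item: str) -> "ChainableList":
        self.objects.append(item)
        return self  # returning self enables chaining

    def extend(self, items: Iterable[str]) -> "ChainableList":
        self.objects.extend(items)
        return self


# Each call hands back the same instance, so calls can be strung together.
chained = ChainableList().append("height").extend(["width", "depth"]).append("mass")
print(chained.objects)
```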
Source code in src/data_handlers/random_variables.py
get_variable ¶
get_variable(name: str) -> RandomVariable
Get a registered random variable by name.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
name | str | Name of the random variable | required |
Returns:
Name | Type | Description |
---|---|---|
RandomVariable | RandomVariable | The requested random variable |
Raises:
Type | Description |
---|---|
KeyError | If no variable exists with the given name |
Source code in src/data_handlers/random_variables.py
get_variables ¶
get_variables() -> Iterable[RandomVariable]
Get all registered random variables.
Returns:
Type | Description |
---|---|
Iterable[RandomVariable] | Iterator over all registered variables |
register_variable ¶
register_variable(var: RandomVariable) -> Self
Register a random variable to the collection.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
var | RandomVariable | Random variable to register | required |
Returns:
Name | Type | Description |
---|---|---|
Self | Self | The RandomVariableList instance |
Source code in src/data_handlers/random_variables.py
sample_all ¶
Sample from all variables in the list.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
size | int | Number of samples to generate per variable | 1 |
Returns:
Type | Description |
---|---|
dict[str, ndarray] | Dictionary mapping variable names to their samples |
Source code in src/data_handlers/random_variables.py
RandomVariableMeta ¶
Metaclass for random variable implementations that automatically adds array handling functionality.
This metaclass inherits from pydantic's model metaclass to maintain compatibility with the BaseModel validation system while adding automatic array handling capabilities to all random variable implementations.
Key Features
- Automatically applies the squeezable decorator to sample(), pdf(), and cdf() methods
- Maintains compatibility with pydantic's model validation system
- Ensures consistent array handling across all random variable implementations
Example
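A minimal sketch of the idea, not the module's own code. It uses a plain `type` metaclass instead of one derived from type(BaseModel) so that it runs without pydantic, and the names `_with_squeeze`, `SqueezingMeta`, and `Constant` are hypothetical.

```python
import functools

import numpy as np


def _with_squeeze(method):
    """Stand-in for the squeezable decorator, squeezing by default."""

    @functools.wraps(method)
    def wrapper(*args, squeeze=True, **kwargs):
        result = method(*args, **kwargs)
        if not squeeze:
            return result  # original behaviour, untouched
        result = np.squeeze(result)
        # 0-d arrays become plain Python scalars.
        return result.item() if result.ndim == 0 else result

    return wrapper


class SqueezingMeta(type):
    """Wraps sample(), pdf(), and cdf() while the class is being created."""

    def __new__(mcs, name, bases, namespace, **kwargs):
        for method_name in ("sample", "pdf", "cdf"):
            if method_name in namespace:
                namespace[method_name] = _with_squeeze(namespace[method_name])
        return super().__new__(mcs, name, bases, namespace, **kwargs)


class Constant(metaclass=SqueezingMeta):
    """Toy 'distribution' that always returns the same value."""

    def __init__(self, value):
        self.value = value

    def sample(self, size=1):
        return np.full((size,), self.value)


c = Constant(3.0)
scalar = c.sample()            # squeezed down to a Python float
vector = c.sample(size=4)      # array of shape (4,)
raw = c.sample(squeeze=False)  # array of shape (1,), not squeezed
```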
Technical Details
- Inherits from type(BaseModel) to maintain pydantic compatibility
- Uses __new__ to modify class attributes during class creation
- Preserves method signatures while adding the squeeze parameter
- Ensures proper type hints and docstring updates
Notes
- The squeezable decorator adds a squeeze parameter to wrapped methods
- When squeeze=True (default), output arrays are squeezed and 0-d arrays are converted to scalar values
- Original method behavior is preserved when squeeze=False
Parameters:
Name | Type | Description | Default |
---|---|---|---|
mcs | | The metaclass instance | required |
name | str | Name of the class being created | required |
bases | tuple | Base classes | required |
namespace | dict | Class namespace dictionary | required |
**kwargs | | Additional keyword arguments passed to type(BaseModel) | required |
Returns:
Type | Description |
---|---|
type | The created class with enhanced array handling capabilities |
UniformDistribution ¶
Bases: RandomVariable[float]
Continuous uniform distribution over an interval [low, high].
The uniform distribution describes equal probability over a continuous interval. Any value between low and high is equally likely to be drawn.
Key Properties
- Mean = (low + high)/2
- Variance = (high - low)²/12
- Constant PDF over [low, high]
- Linear CDF over [low, high]
Attributes:
Name | Type | Description |
---|---|---|
low | float | Lower bound of the interval |
high | float | Upper bound of the interval |
name | str | Identifier for this distribution instance |
seed | Optional[int] | Random seed for reproducible sampling |
cdf ¶
Evaluate the uniform cumulative distribution function.
The CDF is:
- 0 for x < low
- (x-low)/(high-low) for low ≤ x ≤ high
- 1 for x > high
Parameters:
Name | Type | Description | Default |
---|---|---|---|
x | ndarray | Points at which to evaluate the CDF | required |
Other Parameters:
Name | Type | Description |
---|---|---|
squeeze | bool | Whether to remove unnecessary dimensions. Added by RandomVariableMeta. Defaults to True. |
Returns:
Type | Description |
---|---|
NDArray[Any, float] | CDF values at the input points |
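The three cases of the uniform CDF translate directly into a scalar function. `uniform_cdf` is an illustrative name and assumes high > low, which is the condition validate_bounds is documented to enforce.

```python
def uniform_cdf(x: float, low: float, high: float) -> float:
    """Piecewise CDF of U(low, high); assumes high > low."""
    if x < low:
        return 0.0
    if x > high:
        return 1.0
    return (x - low) / (high - low)


# Evaluate below, inside, and above the interval [2, 6].
for point in (1.0, 2.0, 3.0, 6.0, 7.5):
    print(point, uniform_cdf(point, low=2.0, high=6.0))
```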
Source code in src/data_handlers/random_variables.py
pdf ¶
Evaluate the uniform probability density function.
The PDF is 1/(high-low) for x in [low,high] and 0 elsewhere.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
x | ndarray | Points at which to evaluate the PDF | required |
Other Parameters:
Name | Type | Description |
---|---|---|
squeeze | bool | Whether to remove unnecessary dimensions. Added by RandomVariableMeta. Defaults to True. |
Returns:
Type | Description |
---|---|
NDArray[Any, float] | PDF values at the input points |
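A scalar sketch of the constant density, followed by a numerical check of the mean and variance formulas listed under Key Properties. `uniform_pdf` and the midpoint-rule integration are illustrative choices, not taken from the module.

```python
def uniform_pdf(x: float, low: float, high: float) -> float:
    """1/(high - low) inside [low, high], 0 elsewhere; assumes high > low."""
    return 1.0 / (high - low) if low <= x <= high else 0.0


low, high = 2.0, 6.0
n = 100_000
step = (high - low) / n
midpoints = [low + (i + 0.5) * step for i in range(n)]

# Midpoint-rule integrals of f(x), x*f(x), and (x - mean)^2 * f(x).
area = sum(uniform_pdf(x, low, high) for x in midpoints) * step
mean = sum(x * uniform_pdf(x, low, high) for x in midpoints) * step
variance = sum((x - mean) ** 2 * uniform_pdf(x, low, high) for x in midpoints) * step

print(area, mean, variance)  # compare with 1, (low + high)/2, (high - low)**2 / 12
```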
Source code in src/data_handlers/random_variables.py
sample ¶
Generate random samples from the uniform distribution.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
size | int | Number of samples to generate | 1 |
Other Parameters:
Name | Type | Description |
---|---|---|
squeeze | bool | Whether to remove unnecessary dimensions. Added by RandomVariableMeta. Defaults to True. |
Returns:
Type | Description |
---|---|
NDArray[Any, float] | Array of samples from U(low,high) |
Source code in src/data_handlers/random_variables.py
validate_bounds ¶
validate_bounds() -> UniformDistribution
Validate that high > low.
Source code in src/data_handlers/random_variables.py
squeezable ¶
Decorator that makes a function's output array squeezable via an added keyword argument squeeze.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
func | Callable[P, R] | The function to be decorated. | required |
squeeze_by_default | bool | Whether or not to squeeze by default | False |
Returns:
Type | Description |
---|---|
Callable[P, R] | A new function that squeezes the output of the decorated function |
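One way such a decorator could be written is sketched below. It is an assumption-laden illustration: `squeezable_sketch` is a hypothetical name, it is called as a plain function rather than with decorator syntax, and the conversion of 0-d arrays to scalars follows the behaviour described in the RandomVariableMeta notes.

```python
import functools
from typing import Any, Callable

import numpy as np


def squeezable_sketch(
    func: Callable[..., Any], squeeze_by_default: bool = False
) -> Callable[..., Any]:
    """Hypothetical decorator adding a `squeeze` keyword to `func`."""

    @functools.wraps(func)
    def wrapper(*args: Any, squeeze: bool = squeeze_by_default, **kwargs: Any) -> Any:
        result = func(*args, **kwargs)
        if not squeeze:
            return result
        squeezed = np.squeeze(result)
        # Convert 0-d arrays to scalars, keep everything else as an array.
        return squeezed.item() if squeezed.ndim == 0 else squeezed

    return wrapper


def column_of_ones(n: int) -> np.ndarray:
    """Return an (n, 1) column so there is a dimension to squeeze."""
    return np.ones((n, 1))


wrapped = squeezable_sketch(column_of_ones)
kept = wrapped(3)                    # shape (3, 1): no squeeze by default
squeezed = wrapped(3, squeeze=True)  # shape (3,)
single = wrapped(1, squeeze=True)    # Python float 1.0
```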