hic3defdr.util.scaled_nb module

hic3defdr.util.scaled_nb.equalize(data, f, alpha)

Given known scaling factors `f` and a known dispersion `alpha`, creates common-scale pseudodata from the raw values in `data`.

See https://rdrr.io/bioc/edgeR/src/R/equalizeLibSizes.R

Parameters:
    data (np.ndarray) – Matrix of raw data to equalize. Rows are pixels, columns are replicates.
    f (np.ndarray) – Matrix of combined scaling factors for each pixel.
    alpha (float) – Single fixed dispersion to use during equalization.
Returns:	Matrix of equalized data.
Return type:	np.ndarray

hic3defdr.util.scaled_nb.fit_mu_hat(x, b, alpha, verbose=True)

Numerical MLE fitter for the mean parameter of the scaled NB model under fixed dispersion. Vectorized.

See the following colab notebook for background and derivation: https://colab.research.google.com/drive/1SgMMvc3XhfIXoBx8tsyJt-yyBDlRFQCJ

Parameters:
    x (np.ndarray) – The vector of observed counts.
    b (np.ndarray) – The vector of scaling factors, parallel to `x`.
    alpha (np.ndarray) – The vector of dispersions, parallel to `x`.
    verbose (bool) – Pass False to silence reporting of progress to stderr.
Returns:	The MLE of the mean parameter, one value per pixel.
Return type:	np.ndarray

Examples

>>> import numpy as np
>>> from hic3defdr.util.scaled_nb import fit_mu_hat

3 pixels, 2 reps (matrices):

>>> x = np.array([[1, 2],
...               [3, 4],
...               [5, 6]])
>>> b = np.array([[0.9, 1.1],
...               [0.8, 1.2],
...               [0.7, 1.3]])
>>> alpha = np.array([[0.1, 0.2],
...                   [0.3, 0.4],
...                   [0.5, 0.6]])
>>> fit_mu_hat(x, b, alpha)
array([1.47251127, 3.53879843, 5.86853465])

broadcast dispersion down the pixels:

>>> fit_mu_hat(x, b, np.array([0.1, 0.2]))
array([1.47251127, 3.53749833, 5.85554075])

broadcast dispersion across the reps:

>>> fit_mu_hat(x, b, np.array([0.1, 0.2, 0.3])[:, None])
array([1.49544092, 3.51679438, 5.73129492])

1 pixel, two reps (vectors):

>>> fit_mu_hat(np.array([1, 2]), np.array([0.9, 1.1]), np.array([0.1, 0.2]))
array([1.47251127])

broadcast dispersion across reps:

>>> fit_mu_hat(np.array([1, 2]), np.array([0.9, 1.1]), 0.1)
array([1.49544092])

one pixel is fitted with newton, the second is fitted with brentq:

>>> x = np.array([[2, 3, 4, 2],
...               [6, 9, 3, 1]])
>>> b = np.array([[0.45, 0.53, 0.088, 0.091],
...               [0.70, 0.83, 0.14, 0.15 ]])
>>> alpha = np.array([[0.0071, 0.0071, 0.0073, 0.0073],
...                   [0.0070, 0.0070, 0.0072, 0.0072]])
>>> fit_mu_hat(x, b, alpha)
array([ 9.5900971 , 10.45962955])
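
To illustrate what the fitter solves, here is a minimal single-pixel sketch. It is not the library's implementation. It assumes each replicate follows x[i] ~ NB(mean = b[i] * mu, dispersion = alpha[i]) with b[i] > 0, in which case the MLE is the root of the score sum((x - b * mu) / (1 + alpha * b * mu)). The helper name `fit_mu_hat_one_pixel` is hypothetical, and only `scipy.optimize.brentq` is used for root finding.

```python
import numpy as np
from scipy.optimize import brentq


def fit_mu_hat_one_pixel(x, b, alpha):
    """Sketch (hypothetical helper): MLE of mu for a single pixel.

    Assumes x[i] ~ NB(mean=b[i] * mu, dispersion=alpha[i]) and b[i] > 0.
    """
    x = np.asarray(x, dtype=float)
    b = np.asarray(b, dtype=float)
    alpha = np.broadcast_to(np.asarray(alpha, dtype=float), x.shape)

    def score(mu):
        # Derivative of the log-likelihood with respect to mu, times mu.
        return np.sum((x - b * mu) / (1.0 + alpha * b * mu))

    # Every term is >= 0 at min(x / b) and <= 0 at max(x / b),
    # so the root of the (monotone decreasing) score lies between them.
    lo, hi = np.min(x / b), np.max(x / b)
    if lo == hi:
        return float(lo)
    return brentq(score, lo, hi)


mu_hat = fit_mu_hat_one_pixel([1, 2], [0.9, 1.1], [0.1, 0.2])
print(mu_hat)
```

Under these assumptions the printed value agrees with the first entry of the documented output above (about 1.4725), which suggests the model stated here matches the one the library fits.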

hic3defdr.util.scaled_nb.inverse_mvr(mean, var)

Inverse function of the negative binomial fixed-dispersion mean-variance relationship. Vectorized.

Parameters:	mean, var – The mean and variance of a NB distribution, respectively.
Returns:	The dispersion of that NB distribution.
Return type:	float

hic3defdr.util.scaled_nb.logpmf(k, m, phi)

Log of the PMF of the negative binomial distribution, parameterized by its mean m and dispersion phi. Vectorized.

Parameters:
    k (int) – The number of counts observed.
    m (float) – The mean parameter.
    phi (float) – The dispersion parameter.
Returns:	The log of the probability of observing `k` counts.
Return type:	float
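
As an illustration of the mean/dispersion parameterization, the sketch below writes out the NB log-PMF directly and checks it against `scipy.stats.nbinom`, using the standard conversion n = 1 / phi and p = 1 / (1 + m * phi). The name `nb_logpmf_sketch` is hypothetical, and this is not necessarily how the library computes it.

```python
import numpy as np
from scipy.special import gammaln
from scipy.stats import nbinom


def nb_logpmf_sketch(k, m, phi):
    """Sketch (hypothetical helper): NB log-PMF with mean m and dispersion phi."""
    k = np.asarray(k, dtype=float)
    m = np.asarray(m, dtype=float)
    phi = np.asarray(phi, dtype=float)
    r = 1.0 / phi  # the NB "size" parameter
    return (
        gammaln(k + r) - gammaln(r) - gammaln(k + 1.0)
        + k * np.log(phi * m / (1.0 + phi * m))
        - r * np.log1p(phi * m)
    )


k = np.arange(5)
ours = nb_logpmf_sketch(k, 3.0, 0.2)
# Same distribution in scipy's (n, p) parameterization.
reference = nbinom.logpmf(k, 1.0 / 0.2, 1.0 / (1.0 + 3.0 * 0.2))
print(np.allclose(ours, reference))
```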

hic3defdr.util.scaled_nb.mvr(mean, disp)

Negative binomial fixed-dispersion mean-variance relationship. Vectorized.

Parameters:	mean, disp – The mean and dispersion of a NB distribution, respectively.
Returns:	The variance of that NB distribution.
Return type:	float
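
For reference, the usual NB mean-variance relationship under the dispersion parameterization is var = mean + disp * mean**2, and its inverse is disp = (var - mean) / mean**2. The sketch below assumes `mvr` and `inverse_mvr` follow this standard form. The `_sketch` function names are hypothetical stand-ins, not the library's code.

```python
import numpy as np


def mvr_sketch(mean, disp):
    """Sketch: NB variance from mean and dispersion (assumed standard form)."""
    mean = np.asarray(mean, dtype=float)
    return mean + disp * mean ** 2


def inverse_mvr_sketch(mean, var):
    """Sketch: NB dispersion from mean and variance (inverse of the above)."""
    mean = np.asarray(mean, dtype=float)
    var = np.asarray(var, dtype=float)
    return (var - mean) / mean ** 2


means = np.array([1.0, 10.0, 100.0])
variances = mvr_sketch(means, 0.1)
# Round trip: recovering the dispersion from (mean, variance).
print(inverse_mvr_sketch(means, variances))
```

The round trip recovers a dispersion of about 0.1 for each mean, up to floating-point error.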

hic3defdr.util.scaled_nb.q2qnbinom(x, mu_in, mu_out, alpha)

Converts values between two NB distributions with different means but the same dispersion.

See https://rdrr.io/bioc/edgeR/src/R/q2qnbinom.R

Parameters:
    x (np.ndarray) – Vector of values to convert.
    mu_in, mu_out – Vectors of means to convert between.
    alpha (np.ndarray or float) – Single dispersion (to use for all `x`) or vector of dispersions (one per `x`) to hold constant during conversion.
Returns:	`x` converted from `mu_in` to `mu_out`.
Return type:	np.ndarray
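
A rough sketch of quantile-to-quantile conversion follows. It assumes the approach of the linked edgeR function as commonly described: approximate the NB by a normal and by a gamma distribution with matching mean and variance, map the quantile of `x` under the input distribution to the same quantile under the output distribution for each, and average the two results. Whether hic3defdr does exactly this is not confirmed here. The name `q2qnbinom_sketch` is hypothetical, and the sketch requires strictly positive means.

```python
import numpy as np
from scipy import stats


def q2qnbinom_sketch(x, mu_in, mu_out, alpha):
    """Sketch (hypothetical helper): average of normal and gamma quantile maps."""
    x, mu_in, mu_out, alpha = np.broadcast_arrays(
        *(np.asarray(a, dtype=float) for a in (x, mu_in, mu_out, alpha))
    )
    r_in = 1.0 + alpha * mu_in    # variance / mean of the input NB
    r_out = 1.0 + alpha * mu_out  # variance / mean of the output NB
    sd_in = np.sqrt(mu_in * r_in)
    sd_out = np.sqrt(mu_out * r_out)

    # Normal approximation: matching quantiles preserves the z-score.
    q_norm = mu_out + (x - mu_in) / sd_in * sd_out

    # Gamma approximation with matching moments: shape = mu / r, scale = r.
    # Use the upper tail above the mean and the lower tail below it.
    upper = x >= mu_in
    p_upper = stats.gamma.sf(x, mu_in / r_in, scale=r_in)
    p_lower = stats.gamma.cdf(x, mu_in / r_in, scale=r_in)
    q_gamma = np.where(
        upper,
        stats.gamma.isf(p_upper, mu_out / r_out, scale=r_out),
        stats.gamma.ppf(p_lower, mu_out / r_out, scale=r_out),
    )
    return (q_norm + q_gamma) / 2.0


x = np.array([3.0, 10.0, 25.0])
unchanged = q2qnbinom_sketch(x, 10.0, 10.0, 0.1)  # same mean in and out
shifted = q2qnbinom_sketch(x, 10.0, 20.0, 0.1)    # mean doubled
print(unchanged, shifted)
```

When `mu_in` equals `mu_out` the values come back essentially unchanged, and raising the output mean shifts every value upward, which is the behavior the function description implies.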