hic3defdr.util.scaled_nb module

hic3defdr.util.scaled_nb.equalize(data, f, alpha)

Given known scaling factors f and a known dispersion alpha, creates common-scale pseudodata from raw values data.

See https://rdrr.io/bioc/edgeR/src/R/equalizeLibSizes.R

Parameters:
  • data (np.ndarray) – Matrix of raw data to equalize. Rows are pixels, columns are replicates.
  • f (np.ndarray) – Matrix of combined scaling factors for each pixel.
  • alpha (float) – Single fixed dispersion to use during equalization.
Returns:

Matrix of equalized data.

Return type:

np.ndarray

hic3defdr.util.scaled_nb.fit_mu_hat(x, b, alpha, verbose=True)

Numerical MLE fitter for the mean parameter of the scaled NB model under fixed dispersion. Vectorized.

See the following colab notebook for background and derivation: https://colab.research.google.com/drive/1SgMMvc3XhfIXoBx8tsyJt-yyBDlRFQCJ

Parameters:
  • x (np.ndarray) – The vector of observed counts.
  • b (np.ndarray) – The vector of scaling factors, parallel to x.
  • alpha (np.ndarray) – The vector of dispersions, parallel to x.
  • verbose (bool) – Pass False to silence reporting of progress to stderr.
Returns:

The MLE of the mean parameter.

Return type:

np.ndarray

Examples

>>> import numpy as np
>>> from hic3defdr.util.scaled_nb import fit_mu_hat

3 pixels, 2 reps (matrices):

>>> x = np.array([[1, 2],
...               [3, 4],
...               [5, 6]])
>>> b = np.array([[0.9, 1.1],
...               [0.8, 1.2],
...               [0.7, 1.3]])
>>> alpha = np.array([[0.1, 0.2],
...                   [0.3, 0.4],
...                   [0.5, 0.6]])
>>> fit_mu_hat(x, b, alpha)
array([1.47251127, 3.53879843, 5.86853465])

broadcast dispersion down the pixels:

>>> fit_mu_hat(x, b, np.array([0.1, 0.2]))
array([1.47251127, 3.53749833, 5.85554075])

broadcast dispersion across the reps:

>>> fit_mu_hat(x, b, np.array([0.1, 0.2, 0.3])[:, None])
array([1.49544092, 3.51679438, 5.73129492])

1 pixel, two reps (vectors):

>>> fit_mu_hat(np.array([1, 2]), np.array([0.9, 1.1]), np.array([0.1, 0.2]))
array([1.47251127])

broadcast dispersion across reps:

>>> fit_mu_hat(np.array([1, 2]), np.array([0.9, 1.1]), 0.1)
array([1.49544092])

one pixel is fitted with newton, the second is fitted with brentq:

>>> x = np.array([[2, 3, 4, 2],
...               [6, 9, 3, 1]])
>>> b = np.array([[0.45, 0.53, 0.088, 0.091],
...               [0.70, 0.83, 0.14, 0.15 ]])
>>> alpha = np.array([[0.0071, 0.0071, 0.0073, 0.0073],
...                   [0.0070, 0.0070, 0.0072, 0.0072]])
>>> fit_mu_hat(x, b, alpha)
array([ 9.5900971 , 10.45962955])
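
To make the fitting problem concrete, here is a minimal single-pixel sketch. It assumes each count x_i is negative binomial with mean b_i * mu and dispersion alpha_i, so that the score in mu is proportional to the sum of (x_i - b_i * mu) / (1 + alpha_i * b_i * mu). The names score and fit_mu_hat_sketch are hypothetical, and plain bisection stands in for the newton and brentq solvers mentioned in the examples; this is not the library's implementation.

```python
def score(mu, x, b, alpha):
    """Score of the assumed scaled NB model in mu, up to the positive factor 1/mu."""
    return sum((xi - bi * mu) / (1.0 + ai * bi * mu)
               for xi, bi, ai in zip(x, b, alpha))


def fit_mu_hat_sketch(x, b, alpha, tol=1e-12, max_iter=200):
    """Hypothetical helper: find the root of the score for one pixel by bisection."""
    # The score is non-negative at min(x_i / b_i) and non-positive at
    # max(x_i / b_i), and it decreases monotonically in mu, so the root
    # lies inside this bracket.
    ratios = [xi / bi for xi, bi in zip(x, b)]
    lo, hi = min(ratios), max(ratios)
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if score(mid, x, b, alpha) > 0:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


# One pixel, two reps, with a shared dispersion of 0.1.
mu_hat = fit_mu_hat_sketch([1, 2], [0.9, 1.1], [0.1, 0.1])
print(mu_hat)
```

Under these assumptions the result should agree closely with the value fit_mu_hat reports for the same inputs in the examples above.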

hic3defdr.util.scaled_nb.inverse_mvr(mean, var)

Inverse function of the negative binomial fixed-dispersion mean-variance relationship. Vectorized.

Parameters:
  • mean, var – The mean and variance of an NB distribution, respectively.
Returns:

The dispersion of that NB distribution.

Return type:

float

hic3defdr.util.scaled_nb.logpmf(k, m, phi)

Log of the PMF of the negative binomial distribution, parameterized by its mean m and dispersion phi. Vectorized.

Parameters:
  • k (int) – The number of counts observed.
  • m (float) – The mean parameter.
  • phi (float) – The dispersion parameter.
Returns:

The log of the probability of observing k counts.

Return type:

float
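
As an illustration of the mean/dispersion parameterization, the following sketch writes out the NB log PMF with size r = 1 / phi and success probability r / (r + m). The name nb_logpmf_sketch is hypothetical, and the scalar standard-library code is only meant to show the formula, not the vectorized function documented above.

```python
import math


def nb_logpmf_sketch(k, m, phi):
    """Hypothetical scalar NB log PMF with mean m and dispersion phi."""
    r = 1.0 / phi  # NB "size" parameter implied by the dispersion
    return (math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
            + r * math.log(r / (r + m))
            + k * math.log(m / (r + m)))


# Sanity checks: probabilities sum to about 1 and the mean is about m.
m, phi = 3.0, 0.5
probs = [math.exp(nb_logpmf_sketch(k, m, phi)) for k in range(500)]
total = sum(probs)
mean = sum(k * p for k, p in enumerate(probs))
print(total, mean)
```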

hic3defdr.util.scaled_nb.mvr(mean, disp)

Negative binomial fixed-dispersion mean-variance relationship. Vectorized.

Parameters:
  • mean, disp – The mean and dispersion of an NB distribution, respectively.
Returns:

The variance of that NB distribution.

Return type:

float

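
For reference, the standard NB mean-variance relationship under a fixed dispersion is var = mean + disp * mean**2. The sketch below assumes that this is the relationship meant here and shows it together with its inverse; mvr_sketch and inverse_mvr_sketch are hypothetical names, not the library functions.

```python
def mvr_sketch(mean, disp):
    """Hypothetical helper: NB variance from mean and dispersion."""
    return mean + disp * mean ** 2


def inverse_mvr_sketch(mean, var):
    """Hypothetical helper: NB dispersion from mean and variance."""
    return (var - mean) / mean ** 2


# Round trip: recover the dispersion from the variance it implies.
variance = mvr_sketch(3.0, 0.5)
dispersion = inverse_mvr_sketch(3.0, variance)
print(variance, dispersion)
```

Because only arithmetic operators are used, the same two functions also work elementwise on NumPy arrays.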
hic3defdr.util.scaled_nb.q2qnbinom(x, mu_in, mu_out, alpha)

Converts values between two NB distributions with different means but the same dispersion.

See https://rdrr.io/bioc/edgeR/src/R/q2qnbinom.R

Parameters:
  • x (np.ndarray) – Vector of values to convert.
  • mu_in, mu_out – Vectors of means to convert between.
  • alpha (np.ndarray or float) – Single dispersion (to use for all x) or vector of dispersions (one per x) to hold constant during conversion.
Returns:

x converted from mu_in to mu_out.

Return type:

np.ndarray
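
The quantile-to-quantile idea can be illustrated with a deliberately simplified sketch. It assumes a normal approximation to each NB distribution, with variance mu + alpha * mu**2, so that matching quantiles reduces to rescaling the deviation from the mean. The name q2q_normal_sketch is hypothetical; the linked edgeR source remains the reference for the actual method, which this sketch does not reproduce.

```python
import math


def q2q_normal_sketch(x, mu_in, mu_out, alpha):
    """Hypothetical helper: map x between two NB distributions by matching
    quantiles of their normal approximations (same dispersion alpha)."""
    var_in = mu_in + alpha * mu_in ** 2
    var_out = mu_out + alpha * mu_out ** 2
    # Under a normal approximation, equal quantiles means equal z-scores.
    return mu_out + (x - mu_in) * math.sqrt(var_out / var_in)


# A value slightly above the input mean stays above the output mean.
converted = q2q_normal_sketch(12.0, 10.0, 20.0, 0.1)
print(converted)
```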