Histogramming ============= Several options for histograms are provided. Full-sample histogram --------------------- :meth:`nbi_stat.histogram` is similar to ``numpy.histogram`` in that it requires the full sample to be available. However, :meth:`nbi_stat.histogram` differ from ``numpy.histgram`` in some important aspects * We *always* divide the counts by the bin widths so that the *area* is proportional to the probability. * The function allows for both *frequency* and *non-frequency* weights - see :ref:`freq_weights` - for all kinds of binning * The function always returns the uncertainty on each bin, including for the case of non-frequency weights * Rather than returning the bin boundaries, the function returns the bin centres and widths. This allow for easy drawing using ``matplotlib.pyplot.errorbar`` Let's show an example .. plot:: :include-source: >>> import nbi_stat >>> np.random.seed(123456) >>> x = np.random.normal(0,1,size=1000) >>> h, x, w, e = nbi_stat.histogram(x,30) >>> plt.errorbar(x,h,e,w/2,".",label='Data') >>> plt.legend() On-line histogramming --------------------- An "online" method for building histograms is likewise provided. This again allows for weights - both *frequency* and *non-frequency* weights. Three functions are defined for this purpose * :meth:`nbi_stat.init_histogram` to set up the histogram structure. If the option ``weighted`` is given, then the histogram structure contains storage that allows for weights. * :meth:`nbi_stat.fill_histogram` to fill individual observations into the histogram. * :meth:`nbi_stat.fini_histogram` for final calculations on the histogram (i.e., dividing by bin widths and optional normalisation). Let's do an example similar to the above .. plot:: :include-source: >>> import nbi_stat >>> np.random.seed(123456) >>> hist = nbi_stat.init_histogram(np.linspace(-3,3,31)) >>> for _ in range(1000): >>> hist = nbi_stat.fill_histogram(np.random.normal(),*hist) >>> h, x, w, e = nbi_stat.fini_histogram(*hist) >>> plt.errorbar(x,h,e,w/2,".",label='Data') >>> plt.legend() A histogram object ------------------ If one prefers a *Object Oriented* approach one can use objects of the class :class:`nbi_stat.Histogram`. This class defines methods to initialize, fill, and finalize a histogram, as well methods to access the data. .. plot:: :include-source: >>> import nbi_stat >>> np.random.seed(123456) >>> >>> hist = nbi_stat.Histogram(np.linspace(-3,3,31)) >>> >>> for _ in range(1000): >>> hist.fill(np.random.normal()) >>> >>> hist.finalize() >>> s = plt.plot(hist.centers,hist.heights, ... drawstyle='steps-mid',alpha=.5) >>> plt.errorbar(hist.centers, ... hist.heights, ... hist.uncertainties, ... hist.widths/2, ... '.', color=s[0].get_color(),label='Histogram') >>> plt.legend() We can use weights when filling the histogram object. These weights can be either *frequency* or *non-frequency* weights, depending on the value of the ``weighted`` argument to the constructor. If this is ``False`` (default), then we assume *frequency* weights, while if ``True`` we assume *non-frequency* weights. Let's do a frequency weighted histogram first .. plot:: :include-source: >>> import nbi_stat >>> np.random.seed(123456) >>> >>> hist = nbi_stat.Histogram(np.linspace(-3,3,31)) >>> >>> for x in hist.centers: >>> hist.fill(x,np.random.poisson(3)) >>> >>> hist.finalize() >>> >>> s = plt.plot(hist.centers,hist.heights, ... drawstyle='steps-mid',alpha=.5) >>> plt.errorbar(hist.centers, ... hist.heights, ... hist.uncertainties, ... hist.widths/2, ... '.', color=s[0].get_color(), ... label='Frequency weights') >>> plt.legend() And now one with *non-frequency* weights .. plot:: :include-source: >>> import nbi_stat >>> np.random.seed(123456) >>> >>> hist = nbi_stat.Histogram(np.linspace(-3,3,31),True) >>> >>> for _ in range(1000): >>> hist.fill(np.random.normal(),np.random.random()) >>> >>> hist.finalize() >>> >>> plt.fill_between(hist.centers, ... np.zeros_like(hist.heights), ... hist.heights, ... step='mid',alpha=.5) >>> plt.errorbar(hist.centers, ... hist.heights, ... hist.uncertainties, ... hist.widths/2, ... '.', label='Non-frequency weighted') >>> plt.plot(hist.centers, ... hist.sums.sum()/np.sqrt(2*np.pi)* ... np.exp(-.5*hist.centers**2),label='Normal') >>> plt.legend()