Background
In probabilistic analysis, the modern technique is to describe distributions by
parametrically-defined formulæ. Often there are two or three parameters, conveniently
referred to as location, scale and shape, although there may be more or fewer.
The main advantages of this approach are:
- Succinctness
- To convey a result, it is necessary to give only the name of the type of distribution plus
the values of the parameters: for example, "Gaussian distribution, mean = 33.1,
sd = 2.65".
- Detailed properties
- It is usually possible to differentiate and integrate the theoretical expressions to obtain further properties that may be of use,
such as skew.
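As a small illustration of both points, the following Python sketch rebuilds the quoted Gaussian from nothing more than its name and two parameters, then integrates the density numerically to recover its skew. The helper name `skew_by_integration`, the integration range of 8 standard deviations either side of the mean, and the step count are choices made for this sketch only.

```python
from statistics import NormalDist


def skew_by_integration(dist, half_width=8.0, steps=20_000):
    """Third standardised moment, by midpoint-rule integration of dist.pdf.

    half_width and steps are illustrative choices, not prescribed values.
    """
    mu, sigma = dist.mean, dist.stdev
    lo = mu - half_width * sigma
    dx = 2 * half_width * sigma / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * dx  # midpoint of the i-th slice
        total += ((x - mu) / sigma) ** 3 * dist.pdf(x) * dx
    return total


# The whole distribution is conveyed by a name and two numbers.
gaussian = NormalDist(mu=33.1, sigma=2.65)

# Further properties then follow by integration.
skew = skew_by_integration(gaussian)
central_mass = gaussian.cdf(33.1 + 2.65) - gaussian.cdf(33.1 - 2.65)

print(f"skew = {skew:.6f}")
print(f"mass within one sd of the mean = {central_mass:.4f}")
```

The skew comes out as zero to within rounding error, as the symmetry of the Gaussian requires, and roughly 68% of the mass lies within one standard deviation of the mean.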
There are, though, disadvantages. These include:
- Enforced smoothness
- All the parametrically-defined formulæ used in practice give
smooth results (I am using the term "smooth" informally here, not in the technical sense of having a continuous first derivative).
For example, visualise the graph of a unimodal distribution; you will almost certainly be thinking of a bell-shaped
distribution, since that is what you have become used to by using
parametrically-defined formulæ. Yet, compared to the set of unimodal distributions, the
set of bell-shaped distributions has measure zero, although you could
scarcely believe it from publications based on parametrically-defined
formulæ.
- Absence of a formula
- Enforced smoothness aside, often there is no suitable formula in any case.
- Think of ranked distributions.
You can undoubtedly think of models for specific situations, such as the Poisson distribution or Zipf's Law, but
try writing down a formula that yields all ranked distributions (and nothing more): there isn't one.
Because of matters such as these, there are times when no parametrically-defined distribution is applicable. When this happens, another approach is needed. The approach followed here is numerical generation.
This approach is made possible by one crucial result,
reported on this site: a linear bijection from the set of all distributions (of a given degree) to the set of all ranked distributions. All of the other distributions reported here start with that result.
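The bijection itself is not reproduced in this section. Purely to show what a linear bijection of this kind can look like, the Python sketch below uses one elementary map with the stated properties; it should not be read as the result referred to above. It assumes that a ranked distribution of degree n is a list of n probabilities in non-increasing order summing to 1, and the function names `to_ranked`, `from_ranked` and `is_ranked` are hypothetical.

```python
from fractions import Fraction


def to_ranked(q):
    """Linear map p[i] = sum over j >= i of q[j] / (j + 1), 0-based indices.

    Illustrative only: an assumed example of a linear bijection onto
    ranked distributions, not necessarily the one reported on the site.
    """
    p = []
    tail = Fraction(0)
    for j in range(len(q) - 1, -1, -1):
        tail += Fraction(q[j]) / (j + 1)
        p.append(tail)
    p.reverse()
    return p


def from_ranked(p):
    """Inverse map q[j] = (j + 1) * (p[j] - p[j + 1]), taking p[n] as 0."""
    n = len(p)
    q = []
    for j in range(n):
        nxt = Fraction(p[j + 1]) if j + 1 < n else Fraction(0)
        q.append((j + 1) * (Fraction(p[j]) - nxt))
    return q


def is_ranked(p):
    """Non-negative, sums to 1, and in non-increasing order."""
    return (all(x >= 0 for x in p)
            and sum(p) == 1
            and all(a >= b for a, b in zip(p, p[1:])))


# An arbitrary (unranked) distribution of degree 3.
q = [Fraction(1, 10), Fraction(6, 10), Fraction(3, 10)]
p = to_ranked(q)

print(p)               # the ranked image of q
print(is_ranked(p))
print(from_ranked(p) == q)
```

Exact fractions are used so that the round trip can be checked by equality rather than to within a tolerance. Here `q = (1/10, 6/10, 3/10)` maps to `p = (1/2, 2/5, 1/10)`, and `from_ranked` recovers `q` exactly.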
The development of the algorithms needed for numerical generation, not only as a method of
calculation but also as a means of defining the distributions themselves, is what I am
currently working on and what this part of the site is about.
Operations on sets
Any set of distributions can be modified by the following