Generator Combinators¶
Generator combinators are provided for building a new generator based on existing ones. Many of them come from ideas and best practices of functional programming. They can be chained as they receive existing generator(s) as argument and returns new generator.
While you can go through this document from top to the bottom, you might be want to find a suitable combinator for your use case using this table:
Purpose | Related Generator/Combinator | Examples |
---|---|---|
Generate just a constant | just<T> |
0 or "1337" |
Generate a value within constants | elementOf<T> |
a prime number under 100 |
Generate a list of unique values | Arbi<set<T>> |
{3,5,1} but not {3,5,5} |
Generate a value within numeric range of values | interval<T> , integers<T> |
a number within 1 ~9999 |
Generate a pair or a tuple of different types | pairOf<T1,T2> , tupleOf<Ts...> |
a pair<int, string> |
Union multiple generators | unionOf<T> (oneOf<T> ) |
20~39 or 60~79 combined |
Transform into another type or a value | transform<T,U> |
"0" or "1.4" (a number as string). |
Generate a struct or a class object | construct<T,ARGS...> |
a Rectangle object with width and height |
Apply constraints in generated values | filter (suchThat ) |
an even natural number (n % 2 == 0 ) |
Generate values with dependencies or relationships | dependency , chain , pairWith , tupleWith |
a rectangle where width == height * 2 |
Generate a value based on previously generated value | aggregate , accumulate |
a sequence of numbers where each one is between 0.5x and 1.5x of its previous |
Basic Generator Combinators¶
Constants¶
just<T>(T*)
,just<T>(T)
,just<T>(shared_ptr<T>)
: always generates a specific value. A shared pointer can be used for non-copyable types.lazy<T>(std::function<T()>)
: generates a value by calling a functionauto zeroGen = just(0); // template argument is optional if type is deducible auto oneGen = lazy<int>([]() { return 1; });
Selecting from constants¶
You may want to random choose from specific list of values.
-
elementOf<T>(val1, ..., valN)
: generates a typeT
from multiple values for typeT
, by choosing one of the values randomly// generates a prime number under 50 auto primeGen = elementOf<int>(2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47);
elementOf
can receive optional probabilitistic weights (0 < weight < 1
, sum of weights must not exceed 1.0) for generators. If weight is unspecified for a generator, it is calculated automatically so that remaining probability among unspecified generators is evenly distributed.weightedVal(<value>, <weight>)
is used to annotate the desired weight.
// generates a numeric within ranges [0,10], [100, 1000], [10000, 100000] // weight for 10 automatically becomes 1.0 - 0.8 - 0.15 == 0.05 elementOf<int>(weightedVal(2, 0.8), weightedVal(5, 0.15), 10);
Integers and intervals¶
Some utility generators for integers are provided
interval<INT_TYPE>(min, max)
: generates an integer type(e.g.uint16_t
) in the closed interval[min, max]
.integers<INT_TYPE(from, count)
: generates an integer type starting fromfrom
interval<int64_t>(1, 28); interval(1, 48); // template type argument can be ommitted if the input type(`int`) is the same as the output type. interval(1L, 48L); // template type argument can be ommitted if the input type(`int`) is the same as the output type. interval(0, 10); // generates an integer in {0, ..., 10} interval('A', 'Z'); // generates a char of uppercase alphabet integers(0, 10); // generates an integer in {0, ..., 9} integers(1, 10); // generates an integer in {1, ..., 10}
natural<INT_TYPE>(max)
: generates a positive integer up tomax
(inclusive)nonNegative<INT_TYPE>(max)
: : generates zero or a positive integer up tomax
(inclusive)
Pair and Tuples¶
Generators for different types can be bound to a pair or a tuple.
-
pairOf<T1, T2>(gen1, gen2)
: generates astd::pair<T1,T2>
based on result of generatorsgen1
andgen2
auto pairGen = pairOf(Arbi<int>(), Arbi<std::string>());
-
tupleOf<T1, ..., TN>(gen1, ..., genN)
: generates astd::tuple<T1,...,TN>
based on result of generatorsgen1
throughgenN
auto tupleGen = tupleOf(Arbi<int>(), Arbi<std::string>(), Arbi<double>());
Advanced Generator Combinators¶
Selecting from generators¶
You can combine generators to a single generator that can generate each of them with some probability. This can be considered as taking a union of generators.
-
oneOf<T>(gen1, ..., genN)
: generates a typeT
from multiple generators for typeT
, by choosing one of the generators randomly// generates a numeric within ranges [0,10], [100, 1000], [10000, 100000] auto evenGen = oneOf<int>(interval(0, 10), interval(100, 1000), interval(10000, 100000));
oneOf
can receive optional probabilistic weights (0 < weight < 1
, sum of weights must not exceed 1.0) for generators. If weight is unspecified for a generator, it is calculated automatically so that remaining probability among unspecified generators is evenly distributed.weightedGen(<generator>, <weight>)
is used to annotate the desired weight.
// generates a numeric within ranges [0,10], [100, 1000], [10000, 100000] auto evenGen = oneOf<int>(weightedGen(interval(0, 10), 0.8), weightedGen(interval(100, 1000), 0.15), interval(10000, 100000)/* weight automatically becomes 1.0 - (0.8 + 0.15) == 0.05 */);
-
unionOf<T>
is an alias ofoneOf<T>
Constructing an object¶
You can generate an object of a class or a struct type T
, by calling a matching constructor of T
.
-
construct<T, ARG1, ..., ARGN>([gen1, ..., genM])
: generates an object of typeT
by calling its constructor that matches the signature(ARG1, ..., ARGN)
. Custom generatorsgen1
,...,genM
can be supplied for generating arguments. IfM < N
, then rest of the arguments are generated withArbi
s.struct Coordinate { Coordinate(int x, int y) { // ... } }; // ... auto coordinateGen1 = construct<Coordinate, int, int>(interval(-10, 10), interval(-20, 20)); auto coordinateGen2 = construct<Coordinate, int, int>(interval(-10, 10)); // y is generated with Arbi<int>
Applying constraints¶
You can add a filtering condition to a generator to restrict the generated values to have certain constraint.
-
filter<T>(gen, condition_predicate)
: generates a typeT
that satisfies condition predicate (condition_predicate
returnstrue
)// generates even numbers auto evenGen = filter<int>(Arbi<int>(),[](int& num) { return num % 2 == 0; });
-
suchThat<T>
: an alias offilter
Transforming or mapping¶
You can transform an existing generator to create new generator by providing a transformer function. This is equivalent to mapping in functional programming context.
-
transform<T,U>(gen, transformer)
: generates typeU
based on generator for typeT
, usingtransformer
that transforms a value of typeT
to typeU
// generates string from integers (e.g. "0", "1", ... , "-16384") auto numStringGen = transform<int, std::string>(Arbi<int>(),[](int& num) { return std::string(num); });
Deriving or flat-mapping¶
Another combinator that resembles transform
is derive
. This is equivalent to flat-mapping or binding in functional programming. Difference to transform<T,U>
is that you can have greater control on the resultant generator.
-
derive<T, U>(genT, genUGen)
: derives a new generator for typeU
, based on result ofgenT
, which is a generator for typeT
.// generates a string something like "KOPZZFASF", "ghnpqpojv", or "49681002378", ... that consists of only uppercase/lowercase alphabets/numeric characters. auto stringGen = derive<int, std::string>(integers(0, 2), [](int& num) { if(num == 0) return Arbi<std::string>(interval('A', 'Z')); else if(num == 1) return Arbi<std::string>(interval('a', 'z')); else // num == 2 return Arbi<std::string>(interval('0', '9')); });
Following table compares transform
and derive
:
Combinator | transformer signature | Result type |
---|---|---|
transform<T,U> |
function<U(T)> |
Generator<U> |
derive<T,U> |
function<Generator<U>(T)> |
Generator<U> |
Values with dependencies¶
You may want to include dependency in the generated values. There are two variants that do this. One generates a pair and the other one generates a tuple.
-
dependency<T,U>(genT, genUgen)
: generates astd::pair<T,U>
with a generatorgenT
for typeT
andgenUgen
.genUgen
receives a typeT
and returns a generator for typeU
. This can effectively create a generator for a pair where second item depends on the first one.auto sizeAndVectorGen = dependency<int, std::vector<bool>>(Arbi<bool>(), [](int& num) { auto vectorGen = Arbi<std::vector<int>>(); vectorGen.maxLen = num; // generates a vector with maximum size of num return vectorGen; }); auto nullableIntegers = dependency<bool, int>(Arbi<bool>(), [](bool& isNull) { if(isNull) return just<int>(0); else return fromTo<int>(10, 20); });
-
chain<Ts..., U>(genT, genUgen)
: similar todependency
, but takes a tuple generator forstd::tuple<Ts...>
and generates astd::tuple<Ts..., U>
instead of astd::pair
.chain
can be repeatedly applied to itself, and results in a tuple one element larger than the previous one. You can chain multiple dependencies with this form.auto yearMonthGen = tupleOf(fromTo(0, 9999), fromTo(1,12)); // number of days of month depends on month (28~31 days) and year (whether it's a leap year) auto yearMonthDayGen = chain<std::tuple<int, int>, int>(yearMonthGen, [](std::tuple<int,int>& yearMonth) { int year = std::get<0>(yearMonth); int month = std::get<1>(yearMonth); if(monthHas31Days(month)) { return fromTo(1, 31); } else if(monthHas30Days(month)) { return fromTo(1, 30); } else { // february has 28 or 29 days if(isLeapYear(year)) return fromTo(1, 29); else return fromTo(1, 28); } }); // yearMonthDayGen generates std::tuple<int, int, int> of (year, month, day)
Actually you can achieve the similar goal using filter
combinator:
// generate any year,month,day combination
auto yearMonthDayGen = tupleOf(fromTo(0, 9999), fromTo(1,12), fromTo(1,31));
// apply filter
auto validYearMonthDayGen = yearMonthDayGen.filter([](std::tuple<int,int,int>& ymd) {
int year = std::get<0>(ymd);
int month = std::get<1>(ymd);
int day = std::get<2>(ymd);
if(monthHas31Days(month) && day <= 31)
return true;
else if(monthHas30Days(month) && day <= 30)
return true;
else { // february has 28 or 29 days
if(isLeapYear(year) && day <= 29)
return true;
else
return day <= 28;
}
});
However, using filter
for generating values with complex dependency may result in many generated values that do not meet the constraint to be discarded and retried. Therefore it's usually not recommended for that purpose if the ratio of discarded values is high.
Aggregation or Accumulation of Values¶
You may want to generate values that are related to previously generated values. This can be achieved with aggregate
or accumulate
.
Both of the combinators take base generator in the form of Generator<T>
as the first argument and a factory that takes a value of type T
and returns Generator<T>
, as the second argument.
While aggregate
generates a single value, accumulate generates a list of values at each generation.
Combinator | Result type | Remark |
---|---|---|
aggregate<GenT, GenT2GenT>(genT, gen2GenT, minSize, maxSize) |
Generator<T> |
|
accumulate<GenT, GenT2GenT>(genT, gen2GenT, minSize, maxSize) |
Generator<list<T>> |
// generate initial value
auto baseGen = interval(0, 1000);
// generate a value based on previous value
auto gen = aggregate(
gen1,
[](int& num) {
return interval(num/2, num*2);
},
2 /* min size */, 10 /* max size */);
// generate initial value
auto baseGen = interval(0, 1000);
// generate list of values
auto gen = accumulate(
gen1,
[](int& num) {
return interval(num/2, num*2);
},
2 /* min size */, 10 /* max size */);
Utility Methods in Standard Generators¶
Standard generators and combinators (including Arbi<T>
and Construct<...>
) returns a Generator<T>
, which is of the form (Random&) -> Shrinkable<T>
(aliased as GenFunction<T>
), but has additional combinator methods decorated for ease of use. They in fact have equivalent standalone counterparts. Following table shows this relationship:
Decorated method | Result type | Equivalent Standalone combinator |
---|---|---|
Generator<T>::filter |
Generator<T> |
filter<T> |
Generator<T>::map<U> |
Generator<U> |
transform<T,U> |
Generator<T>::flatMap<U> |
Generator<U> |
derive<T,U> |
Generator<T>::pairWith<U> |
Generator<std::pair<T,U>> |
dependency<T,U> |
Generator<T>::tupleWith<U> |
Generator<std::tuple<T,U>> |
chain<T,U> |
Generator<std::tuple<Ts...>>::tupleWith<U> |
Generator<std::tuple<Ts...,U>> |
chain<std::tuple<Ts...>,U> |
These functions and methods can be continuously chained.
-
.map<U>(mapper)
: effectively callstransform<T,U>(gen, transformer)
combinator on itself with typeT
and generatorgen
.// generator for strings of arbitrary number Arbi<int>().map<std::string>([](int &num) { return std::to_string(num); }); // this is equivalent to: transform<int, std::string>(Arbi<int>(), [](int &num) { return std::to_string(num); });
-
.filter(filterer)
: applyfilter
combinator on itself.// two equivalent ways to generate random even numbers auto evenGen = Arbi<int>().filter([](int& num) { return num % 2 == 0; }); auto evenGen = filter<int>(Arbi<int>(),[](int& num) { return num % 2 == 0; });
-
.flatMap<U>(genUGen)
: based on generated result of the generator object itself, induces a new generator for typeU
. It's equivalent combinator isderive<T,U>
. Difference to.map<U>
(ortransform<T,U>
) is that you can have greater control on the resultant generator.auto stringGen = Arbi<int>().flatMap<std::string>([](int& num) { auto genString = Arbi<std::string>(); genString.setMaxSize(num); return genString; });
-
.pairWith<U>(genUGen)
ortupleWith<U>(genUGen)
: chains itself to create a generator of pair or tuple. Equivalent todependency
orchain
, respectively.Arbi<bool>().tupleWith<int>([](bool& isEven) { if(isEven) return Arbi<int>().filter([](int& value) { return value % 2 == 0; }); else return Arbi<int>().filter([](int& value) { return value % 2 == 1; }); }).tupleWith<std::string>([](std::tuple<bool, int>& tuple) { int size = std::get<1>(tuple); auto stringGen = Arbi<std::string>(); stringGen.setSize(size); return stringGen; });
Notice
tupleWith
can automatically chain a tuple generator ofn
parameters into a tuple generator ofn+1
parameters (bool
generator ->tuple<bool, int>
generator ->tuple<bool, int, string>
generator in above example)