Cagen

Case testing and generation tools

This code and write-up dates to October 23, 2020. As I prepared another project, I found this draft untouched and have since decided to release it.

Over the past few weeks I’ve engrossed myself in coding challenges: participating and building tools for participants and hosts alike. Initially developed for the Tech With Tim Discord challenges, I’ve packaged all of it into the repository, Cagen.

Cagen includes Challenge, a fast and detailed testing unit for Python that can be augmented with minimal effort, challenge-specific generators (which are largely built ad hoc and of little use to a third party), and gentools, a library of tools to supplement case generators.

Challenge

The testing unit offers high-precision metrics, control, and speed. Among the many are

  • Runtime and iteration rate measures with precision up to one thousandth of a second,

  • Error handling and tracking

  • Rule violation tracking

  • Time limit enforcement

  • Failed case recording

The user’s end is simple.

  1. Paste the solution function into solution.py and define the function as solution.

  2. Run the main script

  3. Enter the id of the test you would like to take

More information about Challenge can be found here.

gentools

There is only one publicly available tool at the moment of this post, which is binning.

gentools.wrap.binning

The binning strategy solves a problem in number sampling from a very large range.

We usually represent number ranges as $$x \in [a,b]$$ but we can also use a range of digits $$Z \in [{d_a, d_b}]$$ $$d(Z)=10^{Z}-1$$ $$X \in [d(m), d(M) ]$$

The problem comes as this range grows. The sample becomes greatly biased towards numbers of M-1 length. We can demonstrate this bias with a simple proof.

$$b \in [m, M]$$ $$P(X<10^{b})=\frac{10^{b} - 10^{m}}{10^{M}-10^{m}}$$

Establish the lower bound as some constant.

$$c=10^m$$ $$P(X<10^{b})=\frac{10^{b}-c}{10^{M}-c}$$

$$\lim_{M \to +\infty}P(X<10^{b})=\lim_{M \to +\infty}\frac{10^{b}}{10^{M}}$$ $$=\lim_{M \to +\infty}\frac{1}{10^{M-b}}$$

Take this case

$$b=M-1$$ $$\lim_{M \to +\infty}P(X<10^{M-1})=\frac{1}{10}$$

As M gets large, and m is held constant, all numbers with less than M-1 digits will have a likelihood of selection approaching 0.1.

In order to attenuate to this problem, we can use the binning approach. Binning entails breaking the sample into k sub-samples of equal size that select from different exponent ranges, merging them after to create a metasample.

The likelihood a value comes from some bin's exponent range is proportional to the number of bins.

$$P(b)=\frac{1}{k}$$

This solution relegates the problem from the overall sample to each bin, but it still provides a far more equitable distribution of case values.

Cumulative distribution function for a naive generator

Cumulative distribution function for a naive generator

Cumulative density functions for each bin

Cumulative density functions for each bin

Cumulative density function for a 10-bin sample

Cumulative density function for a 10-bin sample

Using Cagen’s gentools.wrap module

The basic implementation

Iterable mapping

More Plots

Probability density functions for naive and binned generators from 1 to 10^18

Probability density functions for naive and binned generators from 1 to 10^18

Runtime performance of binning wrapper vs. naive generator as sample size increases.

Runtime performance of binning wrapper vs. naive generator as sample size increases.

Previous
Previous

The Hannah Montana Problem

Next
Next

Gaming Virtual Economies