support performance testing #15

Open
piccolbo opened this issue Feb 3, 2015 · 8 comments

@piccolbo (Collaborator) commented Feb 3, 2015

No description provided.

piccolbo added a commit that referenced this issue Feb 15, 2015
piccolbo added a commit that referenced this issue Feb 16, 2015

@piccolbo (Collaborator Author)

So far we have introduced a level of visibility into performance. I think the next steps would be:

  • express that performance in units that make more sense, or are more portable from machine to machine
  • expand the API with the possibility of failing a test on insufficient performance

These are quite advanced features and I don't know of any precedent right now. As such, we are not going to try to solve this in the next release or so; an attempt will be made in a dedicated branch.

@piccolbo (Collaborator Author) commented Apr 2, 2015

Actually, the performance measurement has been hidden, as it was confusing.

@piccolbo (Collaborator Author) commented Apr 2, 2015

Done some experiments with sorting. Observations include:

  • we could use performance modeling, exploiting peculiar R features, to do something that has not been done in other languages: run the function under measurement on a variety of input sizes and fit a model. The model could be user-provided, but a low-degree polylog should usually do; we need to think about regularization, though, or the fit will always favor the highest-degree polynomial (see the sketch after this list)
  • outliers, probably due to concurrent activity on the same system, are a problem for model fitting; robust fitting is a partial solution
  • the experiments were aimed at modeling elapsed time as a function of input size, which is hardware-dependent. It would be interesting to model the performance of the target function as a function of performance on reference tasks, such as CPU-, memory- and disk-bound tasks: T(C(S), M(S), D(S)). The functions C, M and D would be fitted for each specific system; then we'd have expectations on the coefficients of T that are independent of the machine
  • we need to extend these ideas to multiple arguments. What should we do with interactions? Is it reasonable to provide a default model?
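
To make the first bullet concrete, here is a minimal sketch of what fitting a polylog model to measured timings could look like. This is not quickcheck code: sort() stands in for an arbitrary function under test, and MASS::rlm provides the robust fitting mentioned above.

```r
# Not quickcheck API: time a function over a range of input sizes and
# robust-fit a low-degree polylog model to the results.
library(MASS)  # for rlm, a robust alternative to lm

sizes <- rep(2^(10:18), each = 5)  # repeat each size to expose timing noise
times <- sapply(sizes, function(n) system.time(sort(runif(n)))[["elapsed"]])

# low-degree polylog basis: 1, log(n), n, n*log(n)
fit <- rlm(times ~ log(sizes) + sizes + I(sizes * log(sizes)))
coef(fit)  # for sort() the n*log(n) coefficient should dominate
```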

piccolbo added a commit that referenced this issue Apr 27, 2015

@piccolbo (Collaborator Author)

Trying to detail the model a bit more. If we want to measure function performance as a function of platform performance (how fast the test machine is), and we want to characterize this over multiple dimensions, we need multiple observables or the system is underdetermined. system.time comes to the rescue, as it offers three different measurements: user, system and elapsed time. So we can think of function performance as a multivariate polylog of input size multiplied by the unit time cost of elementary operations, such as numeric, memory and disk operations. The number of operations can be, say, quadratic in the input size, but the difference between platforms is modeled as linear:

T ~ polylog(S) x C

We don't have C, so we run a number of benchmark expressions and collect the timings (again: user, system and elapsed). We model this similarly with

B = M x C

where M describes how many of each elementary operation each benchmark needs, and C is platform-dependent.
If we invert this (so that B^-1 = C^-1 M^-1) and right-multiply the former equation by it, we obtain

T B^-1 ~ polylog(S) x M^-1

where the unobservable, platform-dependent matrix C cancels out. Since M is fixed across machines, multiplying a polylog by M^-1 yields another function of the same class, so this is equivalent to

T B^-1  ~ polylog(S)

That is, if the above reasoning is correct, we can make the timings portable by pseudo-inverting the benchmark matrix, which is measurable, and right-multiplying the test timings by it. This is a linear transformation, so we can model the result with a poly, polylog or another user-defined model.
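
A hedged sketch of this normalization, assuming three ad hoc reference tasks (the task names and definitions are mine, not part of quickcheck); MASS::ginv computes the Moore-Penrose pseudo-inverse:

```r
# Sketch only: characterize the machine with benchmark timings B, then
# normalize test timings by the pseudo-inverse of B.
library(MASS)  # for ginv, the Moore-Penrose pseudo-inverse

# reference tasks intended to be cpu-, memory- and disk-bound respectively
benchmarks <- list(
  cpu  = function() sum(sin(seq_len(1e6))),
  mem  = function() { x <- matrix(0, 2000, 2000); x[] <- 1; invisible(x) },
  disk = function() { f <- tempfile(); writeLines(as.character(1:1e5), f)
                      readLines(f); unlink(f) })

# B: one row per benchmark; columns are user, system and elapsed time
B <- t(sapply(benchmarks, function(f) system.time(f())[1:3]))

# timings of the function under test at one input size
t_test <- system.time(sort(runif(1e6)))[1:3]

# the (hopefully) portable quantity: T B^-1
t_test %*% ginv(B)
```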

@piccolbo (Collaborator Author)

To clarify, quickcheck's goal is not to model performance data. The goal here is to create portable, performance-related assertions, which ultimately end up in tests. But we may need to provide some modeling tools to help people write such assertions, because of the portability requirement.

@piccolbo (Collaborator Author)

The plan is as follows: users won't model their algorithm's performance T directly, but T B^-1 or another portable transformation of it. They will provide their model coefficients as a vector in their tests. The test will use the model to predict T B^-1 for specific input sizes and compute B on the specific test machine. Then it will compare predicted and actual timings, and when the predictions are exceeded by a TBD amount, the test will fail.
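
Purely as an illustration of this flow (every name here is hypothetical, and the model is collapsed to a single a + b*n*log(n) component for brevity):

```r
# Hypothetical sketch of the planned test flow, not current quickcheck API.
library(MASS)

portable_time <- function(timing, B) drop(timing %*% ginv(B))

# coefs would ship with the test, fitted once on a reference machine;
# B is measured on the test machine as in the previous sketch
check_performance <- function(f, n, coefs, B, slack = 1.5) {
  predicted <- coefs[1] + coefs[2] * n * log(n)  # model of the portable timing
  actual <- portable_time(system.time(f(n))[1:3], B)[1]  # first component only
  if (actual > slack * predicted)
    stop("performance expectation exceeded: ", actual, " > ", slack * predicted)
  invisible(TRUE)
}

# e.g. check_performance(function(n) sort(runif(n)), 1e6, coefs, B)
```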

@piccolbo (Collaborator Author)

It may be worth considering whether performance is a function of input size only or of the input value itself. For textbook algorithms such as sort, input size is all that matters in most cases. But for an RNG, for instance, the input size is generally constant while the run time depends on the actual sample size (an argument). So the general approach would be to make the performance model depend on the actual arguments of a test, and let the modeler decide whether to consider the length of an input or some other function of it.
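
A two-line illustration of the RNG point: rnorm's input is a single scalar, so the "input size" is constant, yet run time scales with the value of that argument.

```r
# input "size" (a length-1 argument) is the same; run times differ greatly
system.time(rnorm(1e5))[["elapsed"]]
system.time(rnorm(1e7))[["elapsed"]]
```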

@piccolbo (Collaborator Author) commented May 1, 2015

A trivial example to bring this back to earth. To test the performance of an implementation of quicksort, qsort(), I can write a test like

```r
test(forall(x = ratomic(), {n = length(x); expect("time.limit", C * n * log(n))}))
```

where C is a constant that depends on the test machine's hardware and software platform, probably more on the hardware in this case. Now if we just made that C a function call C() that returns the right number for each machine, we could write a portable performance test. Such a function doesn't exist yet, and in any event it would not be a scalar function, as machine performance is not described by a single number. So the above complication is an attempt to replace deep knowledge of an architecture and an algorithm (the user would have to quantify disk accesses, memory accesses, etc.) with a more experimental approach, where a machine is characterized by running a number of benchmark expressions.
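
One crude, entirely hypothetical way to approximate that C() for a sorting test: calibrate the constant against a reference n*log(n) task on the current machine, then express the time limit relative to it.

```r
# Assumption-laden sketch, not quickcheck API: calibrate a machine-specific
# constant from a reference run, then build a machine-relative time limit.
C_machine <- local({
  n_ref <- 1e6
  t_ref <- system.time(sort(runif(n_ref)))[["elapsed"]]
  t_ref / (n_ref * log(n_ref))  # seconds per "n log n unit" on this machine
})

# time limit for the qsort() test, with some slack for noise
time_limit <- function(n, slack = 2) slack * C_machine * n * log(n)
```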
