# Compute in deltas

So, for some years I’ve been stuck, unable to figure out delta computing. I usually use the symbol $\Game$ because it looks similar, on my phone, to the symbol I want. Here, though, I will use $\Delta$ in place of $\Game$.

The small delta gives the difference between two programs: $\delta(p_1,p_2)$ is a program that, when applied to the program $p_1$, produces another program that takes any input $x$ of $p_1$ and produces $r_1=\delta(p_1,p_2)(p_1)(x)$, a result equivalent to the second program run on the same input and environment, $r_2=p_2(x)$, such that $r_1\equiv r_2$ for some useful definition of $\equiv$. This is the program difference (pd) between two programs.
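A minimal sketch of the pd, assuming programs are plain Python callables. This trivial delta ignores the internal structure of $p_1$ and simply substitutes $p_2$, which satisfies $r_1\equiv r_2$ by construction; the names `delta`, `double`, and `square` are mine, for illustration only.

```python
def delta(p1, p2):
    """Return a program that maps p1 to a program equivalent to p2."""
    def apply_to(p):
        # the pd is only meaningful when applied at p1
        def patched(x):
            return p2(x)  # r1 = delta(p1,p2)(p1)(x) == p2(x) = r2
        return patched
    return apply_to

double = lambda x: 2 * x
square = lambda x: x * x

patched = delta(double, square)(double)
# patched(5) == square(5) == 25
```

Of course, the interesting pds are those that exploit the structure shared by $p_1$ and $p_2$ rather than discarding it; this sketch only pins down the typing.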

The large delta then gives us the program differential operator (PD). $\Delta(p,a)$ produces a function that yields the change in $p$ when a pd of its argument $a$, $\delta(a_1,a_2)$, is supplied. That is: $\Delta(p,a)(\delta(a_1,a_2))\equiv \delta(p(a=a_1),p(a=a_2))$, where the RHS partial evaluations are performed by specifying just the parameter $a$ and leaving the rest free.
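A hedged sketch of the PD, under the simplifying assumption that a pd of values can be recorded extensionally as its pair of endpoints, and that partial evaluation is `functools.partial` over a named keyword argument. The names `Delta`, `delta`, and `p` are illustrative only.

```python
from functools import partial

def delta(x1, x2):
    # a pd recorded extensionally as the pair of endpoints it connects
    return (x1, x2)

def Delta(p, name):
    """Lift a pd on parameter `name` to a pd on the partially evaluated p."""
    def lift(pd):
        a1, a2 = pd
        return delta(partial(p, **{name: a1}), partial(p, **{name: a2}))
    return lift

def p(a, b):
    return a + 2 * b

lhs = Delta(p, "a")(delta(1, 3))
# lhs is the pd between partial(p, a=1) and partial(p, a=3):
# lhs[0](b=10) == 21 and lhs[1](b=10) == 23
```

Here the equivalence $\Delta(p,a)(\delta(a_1,a_2))\equiv \delta(p(a=a_1),p(a=a_2))$ holds trivially because both sides are built from the same partial evaluations; a real implementation would need a pd representation richer than a pair of endpoints.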

An understanding of a pair of pd operators $(\delta_1, \delta_2)$ allows for reversible change if $\delta_1\circ \delta_2$ is an appropriately typed identity function. A single $\delta$ is an irreversible change. For one example, reading from a true random number generator would be an irreversible program. Inside the realm of a computer, simply reading an input from outside the computer is an irreversible program within the computer, because the computer cannot effect the unpressing of the key. Even though, outside the computer, we may know that the state preceding the “no” was the question “are you sure?” from “rm -rf /”, the computer cannot know that for sure with its own faculties. That is to say, you cannot either, even if you are inside the computer and have access to just the computer’s memories and interfaces. Invertible pairs are intuitive, such as $(\sin(x),\sin^{-1}(x))$.
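The invertible pair can be sketched directly, using the $(\sin, \sin^{-1})$ example: composing the pair acts as the identity on the domain where the pair is appropriately typed (here, $[-\pi/2, \pi/2]$, up to floating-point error).

```python
import math

# A reversible pd pair (d1, d2): d2 ∘ d1 is the identity on [-pi/2, pi/2].
d1 = math.sin
d2 = math.asin

x = 0.5
roundtrip = d2(d1(x))
# abs(roundtrip - x) < 1e-12

# By contrast, reading external input is irreversible: from inside the
# program there is no d2 that reconstructs the unread state of the world.
```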

Our accessible realm of compute in an execution is therefore an accumulation of: an initial state, irreversibly computed outputs, and the compute graph of reversible deltas. By modeling information this way, we can explicitly consider more general changes of state, and we gain a framework for understanding, interacting with, and developing software programs more effectively.

p.s. Btw, these ideas can be expanded equally well into operational and denotational semantics, each with its own idiosyncrasies.

p.p.s. Can we circumvent first-order logic by currying functions instead of using $\forall$? Elsewhere I have worked out the reparameterization to achieve $\forall_{a_1,a_2}\, \Delta(p,a)(\delta(a_1,a_2))\equiv \delta(p(a=a_1),p(a=a_2))$. One of several examples of this kind of reparameterization would be $\Delta(p,\delta_a) \equiv \delta(p)$, where each side now takes two parameters typed for $a$ and yields a function that computes the pd of $p$ when its parameter $a$ changes from the first to the second.

To achieve the first-order-approximation effect of differentiation in ordinary calculus on the reals, all we need is to specify a loose $\equiv^1$, the first-order equivalence, and so on. There are also sub-first-order equivalences, such as: having at least the same number of characters in the program code, being written in the same language, etc. First-order equivalence should minimally require that the programs have sufficiently compatibly typed inputs and outputs. Subsequent higher-order equivalences include progressively more identical runtime behaviors, or progressively more matching meaning.

Here, again, is another example of why the presently described paradigm is beneficial: if a program is stochastic, how do we determine whether another program is equivalent to it, other than by checking that the code is identical? By isolating the irreversible compute of receiving external entropy (from identical sources), the remaining program can be evaluated at the $f^{th}$ order using the conventional $\equiv^f$. Still higher orders of equivalence may require that the programs have the same runtime/memory/resource complexities. Which, btw, inspires an $n^{th}$ ordering $\geq^n$ that requires all equivalences $\equiv^k$ for $k<n$ and then, at the $n^{th}$ level, requires the LHS to be better than the RHS, such as having lower runtime complexity, etc. The details of all these developments are documented more fully elsewhere.
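The graded equivalences can be sketched as predicates of increasing strictness. This is my own toy encoding, not the fuller development alluded to above: `equiv_1` checks only compatible typing (approximated here by arity), and `equiv_2` additionally demands identical observable behavior on sampled inputs.

```python
from inspect import signature

def equiv_1(p, q):
    # first-order: sufficiently compatible inputs/outputs,
    # approximated crudely here by matching arity
    return len(signature(p).parameters) == len(signature(q).parameters)

def equiv_2(p, q, samples):
    # higher order: identical observable behavior on the sampled inputs
    return equiv_1(p, q) and all(p(x) == q(x) for x in samples)

double_add = lambda x: x + x
double_mul = lambda x: 2 * x
# equiv_2(double_add, double_mul, range(10)) holds even though
# the two program texts differ
```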

p.p.p.s. Where is this headed? Well, aside from modeling the universe, one possibility is to achieve truly symbolic differentiation and do back-prop on program code. One can ask for the PD of a program’s unit test with respect to the program. We then pass in the pair (false, true) to arrive at a program (code) mutator that repairs the input program, producing a program that causes the unit test to pass, after which we use the higher ordering to search for a better program.
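A toy stand-in for that repair step, with brute-force search in place of genuinely symbolic differentiation; everything here (`unit_test`, `mutator`, the candidate pool) is hypothetical scaffolding for the shape of the idea, applying the pd $\delta(\text{false},\text{true})$ of the test’s result to obtain a mutator on programs.

```python
def unit_test(p):
    # the test whose result we want to move from False to True
    return p(2) == 4

def mutator(p, candidates):
    """Repair p so that unit_test passes; brute force stands in for the PD."""
    if unit_test(p):
        return p
    for q in candidates:
        if unit_test(q):
            return q
    return None

broken = lambda x: x + 1
fixed = mutator(broken, [lambda x: x * x, lambda x: x + 2])
# unit_test(fixed) is True
```

The higher ordering $\geq^n$ would then rank the passing candidates, e.g. preferring the one with lower runtime complexity.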

One can dream…