Let's suppose I write a program that sorts an list of integers. I'm using a bubble sort, which has polynomial time.

However, I want to speed up my sorting program. We know that there are logarithmic time sorting algorithms, such as merge or quick sort.

Knowing this, I could write a computer program that could speed up my program for me. I give it a constraint that it is functionally identical to my bubble sort program. That is, for every input x passed into my original program P that outputs some y, the new program P_i also outputs the same exact y. How I encode that constraint I do not know. But it's some kind of identity that has to be true.

I then go the very naive route, and have my computer begin enumerating all programs which meet that identify function AND give a speedup (that is, the new generated program sorts faster).

I know eventually, given enough time, my program generator will eventually generate some kind of logarithmic sorting algorithm.

Obviously, there's a lot of problems to solve: how to I enumerate valid programs, how do I encode my identity / speed up constraints, how do I know when to stop?

Besides optimizing that "search" for a faster program given one that works, will there ever be enough computational power on hand for the every day programmer to do stuff like this?

Note this doesn't just mean finding a lower complexity algorithm. It could be using operators or hacks/tricks that give a speed up to the program, e.g. using a Python generator instead of a list; or even replacing a 3rd party XML parser library with one that's slightly faster.

## Could a machine optimize language source code?

**Moderators:** phlip, Moderators General, Prelates

- roflwaffle
**Posts:**357**Joined:**Wed Jul 01, 2009 6:25 am UTC

### Re: Could a machine optimize language source code?

It sounds like you're describing Stoke. The only difference is that it has an optimization mode that lets it temporarily violate some of the hard constraints so it can further optimize some of the soft constraints. I think this lets Stoke move out of local minimums it may otherwise settle into under it's Synthesis mode.

https://www.youtube.com/watch?v=aD9mZDJzb58

https://theory.stanford.edu/~aiken/publ ... hkufza.pdf

https://www.youtube.com/watch?v=aD9mZDJzb58

https://theory.stanford.edu/~aiken/publ ... hkufza.pdf

### Re: Could a machine optimize language source code?

A compiler restricts itself to doing small-scale transformations which it knows cannot change the program's behaviour.

Stoke seems to apply small-scale transformations at random, afterwards checking for correctness.

You want to search the whole space of all possible programs for one that's identical and better than yours.

A few solvable problems:

* enumerating valid programs is easy: enumerate ascii strings and discard those that don't compile. You can skip 0 bytes, page feeds and other special characters if you wish to optimize, or you could enumerate concatenations of valid compiler tokens instead of concatenations of ascii letters, but the basic idea is the same.

* the number of valid programs is countable, but infinite. You could stop the search after, say, twice the size of your original program, because huge programs for simple tasks are rarely fast. Or just keep it running until you're satisfied. But you'll never be able to search the whole program space.

* the number of possible inputs you have to check is countable, but infinite. You will also need to restrict the size of the input data, e.g. restrict yourself to sorting sequences no longer than 1.000.000 numbers. Luckily, most programs can be discarded after the first few runs, since they won't output sorted sequences.

* It is not even possible to automatically determine whether a program will ever output something and exit (see: halting problem). To solve that, you need to place further restrictions. For example, you could ensure that your input program terminates on all inputs. Any candidate program that takes longer than the input program without exiting gets terminated and discarded. Of course you would need to be careful here, because a O(log(n)) algorithm is often slower at small n than a O(n) algorithm, so let them run a bit longer and don't discard them until you either have proof of inequality or a reasonable spectrum of benchmarks.

One unsolvable problem:

* We live in a universe with finite computing power. The amount of possible ASCII strings with less than 40 bytes is larger than the amount of atoms in the observable universe. Identically, if you wish to test all possible inputs with no more than 40 bytes, there are more input variations than atoms. Trying ascii-string of several kilobyte against megabytes of inputs would far exceed the available computing power of this whole universe.

Brute forcing will never work. And the theoretical computer scientists have proof that there is no better way to automatically prove program equality than brute forcing. If there was, we wouldn't have bugs in our code.

Stoke seems to apply small-scale transformations at random, afterwards checking for correctness.

You want to search the whole space of all possible programs for one that's identical and better than yours.

A few solvable problems:

* enumerating valid programs is easy: enumerate ascii strings and discard those that don't compile. You can skip 0 bytes, page feeds and other special characters if you wish to optimize, or you could enumerate concatenations of valid compiler tokens instead of concatenations of ascii letters, but the basic idea is the same.

* the number of valid programs is countable, but infinite. You could stop the search after, say, twice the size of your original program, because huge programs for simple tasks are rarely fast. Or just keep it running until you're satisfied. But you'll never be able to search the whole program space.

* the number of possible inputs you have to check is countable, but infinite. You will also need to restrict the size of the input data, e.g. restrict yourself to sorting sequences no longer than 1.000.000 numbers. Luckily, most programs can be discarded after the first few runs, since they won't output sorted sequences.

* It is not even possible to automatically determine whether a program will ever output something and exit (see: halting problem). To solve that, you need to place further restrictions. For example, you could ensure that your input program terminates on all inputs. Any candidate program that takes longer than the input program without exiting gets terminated and discarded. Of course you would need to be careful here, because a O(log(n)) algorithm is often slower at small n than a O(n) algorithm, so let them run a bit longer and don't discard them until you either have proof of inequality or a reasonable spectrum of benchmarks.

One unsolvable problem:

* We live in a universe with finite computing power. The amount of possible ASCII strings with less than 40 bytes is larger than the amount of atoms in the observable universe. Identically, if you wish to test all possible inputs with no more than 40 bytes, there are more input variations than atoms. Trying ascii-string of several kilobyte against megabytes of inputs would far exceed the available computing power of this whole universe.

Brute forcing will never work. And the theoretical computer scientists have proof that there is no better way to automatically prove program equality than brute forcing. If there was, we wouldn't have bugs in our code.

### Re: Could a machine optimize language source code?

Tub wrote:Brute forcing will never work. And the theoretical computer scientists have proof that there is no better way to automatically prove program equality than brute forcing. If there was, we wouldn't have bugs in our code.

Intel (and to a lesser degree, all other chip design companies) spends a great deal of time/money verifying that each chip does exactly what it is supposed to do. Of course, you still need to have both the verification and the spec bug-free (the reason pentium chips couldn't divide correctly was because there were bugs in both the design and the "proof" showing it was correct). My guess is that this is in fact the problem: you shouldn't design an optimizer to match the code. You should write a program that describes the output (example (sorted:= [index]<=[index+1] for range [0:index-1]), this gives the optimizer the ability to check correctness in minimal time. Note minimal time probably still means "brute force" for certain cases. My guess is that if the verifier doesn't quickly converge and declare the circuit correct, Intel just redesigns either the circuit or the verification spec-program.

The statement "for all possible" just screams "use a quantum computer". If someone managed to figure out how to "just use another layer of indirection" on such a beast, I would assume a new revolution in computer science similar to the invention of the transistor. Since this doesn't seem to be the case (only extremely specialized "programs" for quantum computers have been devised), I would assume that you would have to avoid the indirection issue (only one set of "all possible"). If you use the quantum computer to test "all possible" computer snippets you need to be able to show the correctness/shortness of the snippets using nothing but qbits. With a few loops (I have no idea if this is possible), it might be possible to *barely* reduce the issue as you have to loop through each variable and show that it is algebraically equal to the correct output. My guess is that this is where everything breaks down.

The gist of NP-complete requires that you can prove the correct answer correct*. The issue is finding it. While you *can* prove the correct answer correct here, it appears to be a slow process. Given a magical NP-complete solving machine (which solves your problem in "finite" time), it is entirely possible that the magical machine needs to be "larger" to complete the nasty final proof.

* I clearly don't get the definitions I've read for these things. The claim is that given the answer you can show it is correct, but it often leaves out the bit about the "answer" containing essentially infinite amounts of data. Unless there is a means of finding the length of the traveling salesman problem (without solving the actual problem), I have no way to tell from one given path if there exists a shorter path. If you give me the lengths of all possible paths it is easy to see the shortest one, but that isn't what I see in the given definitions.

### Re: Could a machine optimize language source code?

Jumping in just to respond to that last bit of confusion about the definition of NP:

The classes P, NP, etc. contain what are called "decision problems". An instance of a decision problem is a question with a yes/no answer, e.g., "Is this particular Boolean formula satsfiable?", "Does this particular graph have a Hamiltonian cycle?", etc. You're talking about a question -- "What is the shortest path visiting all nodes in this graph?" -- that does not have a yes/no answer, i.e., is not a decision problem. So you're right: it doesn't make sense to talk about that version of the question being in NP. What is in NP is the decision problem version: "Is there a path visiting all nodes of G that has total length less than k?", where one specifies not only the graph but also a path length cutoff. These two versions of the problem are, in a meaningful sense, as difficult as each other: if we have an algorithm for the first one, we can solve the second (by simply finding the shortest path and seeing whether its length is less than k), and if we have an algorithm for the second one, we can solve the first reasonably quickly using a binary search approach.

The classes P, NP, etc. contain what are called "decision problems". An instance of a decision problem is a question with a yes/no answer, e.g., "Is this particular Boolean formula satsfiable?", "Does this particular graph have a Hamiltonian cycle?", etc. You're talking about a question -- "What is the shortest path visiting all nodes in this graph?" -- that does not have a yes/no answer, i.e., is not a decision problem. So you're right: it doesn't make sense to talk about that version of the question being in NP. What is in NP is the decision problem version: "Is there a path visiting all nodes of G that has total length less than k?", where one specifies not only the graph but also a path length cutoff. These two versions of the problem are, in a meaningful sense, as difficult as each other: if we have an algorithm for the first one, we can solve the second (by simply finding the shortest path and seeing whether its length is less than k), and if we have an algorithm for the second one, we can solve the first reasonably quickly using a binary search approach.

Xenomortis wrote:O(n^{2}) takes on new meaning when trying to find pairs of socks in the morning.

### Re: Could a machine optimize language source code?

Just wanted to add that a quantum computer wouldn't simply be able to just test "all possible" things at once -- that would be a nondeterministic machine (almost). It's still unknown if the two models are equivalent, but it is suspected that NP complete problems cannot be solved by a quantum machine (whereas a nondeterministic machine does them in polynomial time by definition).

More on topic, since humans can optimize source code, sure, just simulate a human brain on your machine, teach it cs, and have it optimize code for us!

More on topic, since humans can optimize source code, sure, just simulate a human brain on your machine, teach it cs, and have it optimize code for us!

### Re: Could a machine optimize language source code?

wumpus wrote:The statement "for all possible" just screams "use a quantum computer".

That doesn't magically give you infinite processing power. It can reduce the complexity of your algorithm by a lot, but as I calculated above, you're so many orders of magnitude above the computation power of the universe that you'd need a major reduction to be feasible. IIRC the rule of thumb for quantum computing is to take the square root of your complexity. But you're checking O(2^n) programs against O(2^n) inputs where each computation takes up to O(BB(n)) steps. Square rooting either of those factors isn't going to put a dent in your problem.

I don't even think the problem is NP complete.

>-) wrote:More on topic, since humans can optimize source code, sure, just simulate a human brain on your machine, teach it cs, and have it optimize code for us!

Well, the OP had a requirement for correctness. I don't think a simulated human brain would be able to satisfy that.

### Re: Could a machine optimize language source code?

[quote="Tub"But you're checking O(2^n) programs against O(2^n) inputs where each computation takes up to O(BB(n)) steps. Square rooting either of those factors isn't going to put a dent in your problem.

[/quote]

Checking against the inputs is almost certainly a mistake (once you do that you hit the limits of the universe as you mention). That is the whole point of coding a means of testing the results instead of a naive example (unfortunately, for most non-trivial problems they are the same: see hardware verification). The next problem is that if the optimal program or the test program contain [non unrollable] loops, you have most of the issues of the halting problem (you don't have a true halting problem as you can ignore any result that takes more steps than some known code).

At this point we should ask ourselves if we should "merely" look for a wildly optimized source code instead of a perfect one. Consider the case of evolution: test a subset of modified code, check if any are correct (and more optimized) and repeat. This gets rid of the O(2^N) programs (much closer to O(N)). You will still likely need some sort of "magic box" to either check all inputs or otherwise verify identical operation. So far, no quantum computer can do this (if they could all private key crypto would be dead, as of now only public key crypto is in danger of high-qubit quantum computers).

So my answer is that you can make an extremely powerful machine language source code optimizer if and only if you can prove two programs equal in some reasonable amount of time. Note that this doesn't require that you be able to prove *all* equal programs equal, just find enough that if there exist more optimal programs, you will find "enough". My guess that this is only reasonably possible for non-looping code, which almost certainly won't contain enough optimal programs to make it worthwhile.

[/quote]

Checking against the inputs is almost certainly a mistake (once you do that you hit the limits of the universe as you mention). That is the whole point of coding a means of testing the results instead of a naive example (unfortunately, for most non-trivial problems they are the same: see hardware verification). The next problem is that if the optimal program or the test program contain [non unrollable] loops, you have most of the issues of the halting problem (you don't have a true halting problem as you can ignore any result that takes more steps than some known code).

At this point we should ask ourselves if we should "merely" look for a wildly optimized source code instead of a perfect one. Consider the case of evolution: test a subset of modified code, check if any are correct (and more optimized) and repeat. This gets rid of the O(2^N) programs (much closer to O(N)). You will still likely need some sort of "magic box" to either check all inputs or otherwise verify identical operation. So far, no quantum computer can do this (if they could all private key crypto would be dead, as of now only public key crypto is in danger of high-qubit quantum computers).

So my answer is that you can make an extremely powerful machine language source code optimizer if and only if you can prove two programs equal in some reasonable amount of time. Note that this doesn't require that you be able to prove *all* equal programs equal, just find enough that if there exist more optimal programs, you will find "enough". My guess that this is only reasonably possible for non-looping code, which almost certainly won't contain enough optimal programs to make it worthwhile.

### Re: Could a machine optimize language source code?

There exist optimal code generators for non-looping code. They aren't used in practice however because of their runtime.

For looping code the problem (of determining whether two programs are equivalent for some infinite set of inputs) is undecidable as others pointed out.

The parenthesis only apply if your input space is finite (in which case no loops are necessary).

For looping code the problem (of determining whether two programs are equivalent for some infinite set of inputs) is undecidable as others pointed out.

wumpus wrote:The next problem is that if the optimal program or the test program contain [non unrollable] loops, you have most of the issues of the halting problem (you don't have a true halting problem as you can ignore any result that takes more steps than some known code).

The parenthesis only apply if your input space is finite (in which case no loops are necessary).

- kaispencer
**Posts:**5**Joined:**Sun Feb 21, 2016 4:51 pm UTC

### Re: Could a machine optimize language source code?

Back at university my friend had a work in which he describes why machine can not replace people in such questinos. But I have to admit they can sometimes do exactly as they were needed in one of the three cases.

- Soupspoon
- You have done something you shouldn't. Or are about to.
**Posts:**2467**Joined:**Thu Jan 28, 2016 7:00 pm UTC**Location:**53-1

### Re: Could a machine optimize language source code?

kaispencer wrote:Back at university my friend had a work in which he describes why machine can not replace people in such questinos. But I have to admit they can sometimes do exactly as they were needed in one of the three cases.

Sorry, mild necro (of the OP). But this thread reminds me of two matters, both interestingly linking together by discussions with a fellow student, back in my university days.

Firstly, the possibility or 'compiling' directly from Z Specification, rather than merely using it to (as per its name) specify the way the code that one would write would work. It has been done (to various degrees), both before and after the above discussions, although one of the particular problems was (and, forgive me, I've totally forgotten how to write Z Specification, so this is Z-pseudocode at best!) having something like "UnsortedRecords(Index:N;FirstName:S;LastName:S) -Sort(LastName:S,ascending)-> SortedRecords(Index:N;FirstName:S;LastName:S)", which is then (theoretically!) implemented in the 'best' way.

(Note, I don't think my friend, back than, ever got this thing to do this, but it was the original idea. I was more interested in other things - which didn't work out so well, either!)

Depending on where the UnsortedRecords typically come from (something that the Z-Spec-Compiler might be able to track-back through the prior elements of specification being used), different sorts might be more useful than others, depending on how (and in what manner) it was 'unsorted' before. If UnsortedRecords was just an 'unblessed' previously sorted dataset (perhaps with an indeterminate number of append/prependations) then a bubbling (from the right direction) could work fastest, but if if UnsortedRecords was the concatenation of OldSortedRecords plus NewSortedRecords, then the process could even (perhaps by backtracking and altering the concatenation, if there were no actual strict 'expectations' of strict concatenations) just convert the original meld into an abbreviated form of shuffle-sort, between the two original stacks.

There'd naturally be the ability to mark in the specification (globally, or locally to particular subsections) whether results should perhaps be optimal with regards for best speed/lowest usage of memory/etc, depending on what was actually important. And where the data involved is derived from complex pathways and may rely upon external influences that the Z-Spec may have no knowledge of... e.g. do the users who input the information work through forms in the inbox as FIFO, LIFO, first sort them by surname themselves, during manual checking, etc.

...which, of course, only ever makes the optimised code as good as the specifiers and coders involved. The specifier might or might not realise the possibility of getting the operators to ease the final program, and the coder might or might not have put a large repertoire of useful sort methods into the optimising specification-coder.

And that leads into the second point, that to properly test a given programming system to all limits requires an ever more complex program, in a manner similar to the classic Halting Problem. Which rather puts the kibosh on making the best and most optimised program when you're also trying to use the 'rule of least power' to the output to ensure that the (potentially more powerful) automated author can always make a decision (or at least a sufficiently complete partial-decision) upon each and every point of possible optimisation. (But, probably by no coincidence, that's what the far more flexible human brain is trying to do, in moulding the restrictive and not intrinsically intuitive form of computer code.)

Or... you could restrict yourself to as much or as little as you can get away with in trying absolutely every single possible combination of code (within defined limits) against every single possible of inputs/interactions (again, you'd need to define limits, at least of time taken) and first of all weed out all 'failures' (don't start work, don't stop working, don't work correctly) and then find the best all-round working version for all (weighted?) possibilities of input. But then you're getting into (and beyond!) astronomical numbers of tests, unless you can tie down the variations sufficiently to at least set up a darwinian selection/mutation/refinement cycle and you're open to taking a chance as to whether you even get led down the right garden path to get a sufficiently useful end-product.

TL;DR; Machines can optimise source code, but only to limited degrees (typically by ensuring that values that are never ever to be used again aren't needlessly copied to storage, or re-ordering so that a register value is re-used straight away rather than copied off, replaced with another that needn't be worked on straight away, which is in turn copied off again so that the original value can be copied in again later), although they will often out-perform the more casual hacker-of-code by using their pre-programmed heat-seaking optimisation of pre-anticipated coding tricks, whilst the coder might use their intuition to come up with something new, but may also get blind-sided and miss an obvious work-around that they even had used elsewhere in the code...

(Most of the above has already been aluded to, elsewhere, so sorry for the repetition. Consider this post non-optimised. )

### Who is online

Users browsing this forum: No registered users and 5 guests