Mind Blowing Algorithms

A place to discuss the science of computers and programs, from algorithms to computability.

Formal proofs preferred.

Moderators: phlip, Moderators General, Prelates

Xanthir
My HERO!!!
Posts: 5400
Joined: Tue Feb 20, 2007 12:49 am UTC
Contact:

Re: Mind Blowing Algorithms

I wrote code almost identical to that for Project Euler problems. I even ran timing tests to find the layout that would factorize a number fastest. I found that it was better, for each number, to store the *other* factor of each factor pair.

That is, if you're doing 2 (thus setting every 2nd value in the array to something), each number is 2*X, so you should store X at its array location. You can extract the prime factor with a single division, and you can use the stored value directly as the next array index to fetch. I found, at least in my code, that this was faster than storing the prime factor and requiring a division to find the next array index. I suppose it's due to pipelining, as it can do the division and the next array lookup concurrently?

This was just storing single numbers in the array, not strings representing the factorization, because I was actually trying to do work with it. ^_^
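For concreteness, here's a minimal sketch of that layout (names like `build_cofactor_table` are invented for illustration, not Xanthir's actual code): each composite index stores the cofactor left after dividing out its smallest prime.

```python
def build_cofactor_table(limit):
    """Sieve where cofactor[m] = m // p for the smallest prime p dividing m.

    Entries left at 0 mean "prime" (or 0/1). This is the layout described
    above: store the *other* factor, not the prime itself.
    """
    cofactor = [0] * (limit + 1)
    for p in range(2, limit + 1):
        if cofactor[p] == 0:  # p is prime: no smaller prime has marked it
            for m in range(p * p, limit + 1, p):
                if cofactor[m] == 0:
                    cofactor[m] = m // p
    return cofactor

def factorize(n, cofactor):
    factors = []
    while cofactor[n]:  # composite: the stored cofactor is the next index...
        factors.append(n // cofactor[n])  # ...and one division recovers the prime
        n = cofactor[n]
    if n > 1:
        factors.append(n)  # what's left is prime
    return factors
```

Factoring then just walks the chain of stored cofactors, with one division per prime factor extracted.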
(defun fibs (n &optional (a 1) (b 1)) (take n (unfold '+ a b)))

jestingrabbit
Factoids are just Datas that haven't grown up yet
Posts: 5967
Joined: Tue Nov 28, 2006 9:50 pm UTC
Location: Sydney

Re: Mind Blowing Algorithms

Xanthir wrote:This was just storing single numbers in the array, not strings representing the factorization, because I was actually trying to do work with it. ^_^

Would storing a tuple, or having a second list, really be worse than the divisions?
ameretrifle wrote:Magic space feudalism is therefore a viable idea.

Flumble
Yes Man
Posts: 2227
Joined: Sun Aug 05, 2012 9:35 pm UTC

Re: Mind Blowing Algorithms

phlip wrote:
Flumble wrote:(and you've got a few code'os, namely [...] int((len(factor)-1)/n)+1; the -1 is correct if int is the ceiling function )

No, that's also correct. len(factor)-1 is the largest number that we're calculating the factors for, so (len(factor)-1)//n is the largest multiple of n we care about. The +1 is because python's range() function is left-inclusive (i.e. range(a,b) generates the integers a <= k < b).

Gotcha. I thought it was pseudocode with 1-based arrays and an inclusive range function.

Charlatan
Posts: 7
Joined: Wed Jan 07, 2015 8:12 pm UTC

Re: Mind Blowing Algorithms

One for me is O(n) sorting [radix sorting on strings or numbers], as opposed to a typical O(n log n) comparison sort that's used in most language sorting libraries.

Examples would be MSD sort, LSD sort, and three-way radix quicksort.
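As an illustration, here's a minimal LSD radix sort over base-256 digits (a sketch, not taken from any particular library): one stable bucketing pass per byte, so the total work is O(n) for fixed-width keys.

```python
def lsd_radix_sort(nums):
    """LSD radix sort for non-negative ints, one byte (base 256) at a time.

    Each pass is a stable bucket sort on one digit, so the ordering
    established by earlier (less significant) passes is preserved.
    """
    if not nums:
        return nums
    shift = 0
    max_val = max(nums)
    while shift == 0 or (max_val >> shift):
        buckets = [[] for _ in range(256)]
        for x in nums:
            buckets[(x >> shift) & 0xFF].append(x)
        nums = [x for bucket in buckets for x in bucket]
        shift += 8
    return nums
```

Stability is the load-bearing property: a later pass never reorders items that agree on its digit.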

notzeb
Without Warning
Posts: 629
Joined: Thu Mar 08, 2007 5:44 am UTC
Location: a series of tubes

Re: Mind Blowing Algorithms

Hashlife algorithm
LLL lattice basis reduction algorithm
Ellipsoid method
Gröbner basis (Buchberger's algorithm)
Cylindrical Algebraic Decomposition

Xanthir
My HERO!!!
Posts: 5400
Joined: Tue Feb 20, 2007 12:49 am UTC
Contact:

Re: Mind Blowing Algorithms

jestingrabbit wrote:
Xanthir wrote:This was just storing single numbers in the array, not strings representing the factorization, because I was actually trying to do work with it. ^_^

Would storing a tuple, or having a second list, really be worse than the divisions?

It's worse on memory. I sized the list to just a little under where it started thrashing memory. Project Euler involves a *lot* of large-number factoring. The extra time from having to skip down the list was more than compensated by not having to start trial division as early.

phillip1882
Posts: 141
Joined: Fri Jun 14, 2013 9:11 pm UTC
Location: geogia
Contact:

Re: Mind Blowing Algorithms

Smoothsort is a pretty cool algorithm, though it's not easy to code.

Code: Select all

import random

A = list(range(0, 100))

def bubbleDown(Tree, Size, A, i, j):
    #sort the root node with the rest of the tree.
    k = 1
    while k < Size[j]:
        child1 = i-1
        child2 = i-Tree[Size[j]-k-1]-1
        #determine the maximum of the two children nodes
        if A[child1] > A[child2]:
            maxchild = child1
            k += 2
        else:
            maxchild = child2
            k += 1
        #sort if smaller, else break
        if A[i] < A[maxchild]:
            A[i],A[maxchild] = A[maxchild],A[i]
            i = maxchild
        else:
            break

def inqueNode(Tree, Size, A):
    #find all the heads, sort the most recent node with the heads.
    heads = [Tree[Size[0]]-1]
    for j in range(1, len(Size)):
        heads += [heads[j-1] + Tree[Size[j]]]
    j = len(Size)-1
    while j > 0 and A[heads[j]] < A[heads[j-1]]:
        #if the recent node has no children, swap
        if Size[j] == 1 or Size[j] == 0:
            A[heads[j]],A[heads[j-1]] = A[heads[j-1]],A[heads[j]]
            j -= 1
        #if the recent node has children,
        #make sure that the node you're swapping with is larger than both
        elif (A[heads[j-1]] > A[heads[j]-1]
              and A[heads[j-1]] > A[heads[j]-Tree[Size[j]-2]-1]):
            A[heads[j]],A[heads[j-1]] = A[heads[j-1]],A[heads[j]]
            j -= 1
        else:
            break
    i = heads[j]
    #once the recent node is in the correct place, treesort.
    bubbleDown(Tree, Size, A, i, j)

def dequeNode(Tree, Size, A):
    #do the same thing as inque, except do it
    #for the two children nodes created.
    heads = [Tree[Size[0]]-1]
    for j in range(1, len(Size)):
        heads += [heads[j-1] + Tree[Size[j]]]
    j = len(Size)-2
    while j > 0 and A[heads[j]] < A[heads[j-1]]:
        if Size[j] == 1 or Size[j] == 0:
            A[heads[j]],A[heads[j-1]] = A[heads[j-1]],A[heads[j]]
            j -= 1
        elif (A[heads[j-1]] > A[heads[j]-1]
              and A[heads[j-1]] > A[heads[j]-Tree[Size[j]-2]-1]):
            A[heads[j]],A[heads[j-1]] = A[heads[j-1]],A[heads[j]]
            j -= 1
        else:
            break
    i = heads[j]
    bubbleDown(Tree, Size, A, i, j)
    j = len(Size)-1
    while j > 0 and A[heads[j]] < A[heads[j-1]]:
        if Size[j] == 1 or Size[j] == 0:
            A[heads[j]],A[heads[j-1]] = A[heads[j-1]],A[heads[j]]
            j -= 1
        elif (A[heads[j-1]] > A[heads[j]-1]
              and A[heads[j-1]] > A[heads[j]-Tree[Size[j]-2]-1]):
            A[heads[j]],A[heads[j-1]] = A[heads[j-1]],A[heads[j]]
            j -= 1
        else:
            break
    i = heads[j]
    bubbleDown(Tree, Size, A, i, j)

def smoothsort(A):
    Tree = [1, 1]
    while Tree[-1] < len(A):
        Tree += [Tree[-1] + Tree[-2] + 1]
    Size = [1]
    for i in range(1, len(A)):
        if Size[-1] == 1:
            Size += [0]
        else:
            Size += [1]
        while True:
            if (len(Size) > 2 and Tree[Size[-3]] + Tree[Size[-2]]
                    + Tree[Size[-1]] == Tree[Size[-3]+1]):
                Size.pop()
                Size.pop()
                Size[-1] += 1
            else:
                break
        inqueNode(Tree, Size, A)
    for i in range(len(A)-1, 1, -1):
        v = Size.pop()
        if v > 1:
            Size += [v-1, v-2]
        dequeNode(Tree, Size, A)

random.shuffle(A)
print A
smoothsort(A)
print A
good luck have fun

resauce
Posts: 1
Joined: Thu Jan 15, 2015 6:36 pm UTC

Re: Mind Blowing Algorithms

Code: Select all

def fib(n):
    return fib(n-1) + fib(n-2) if n > 1 else n

PM 2Ring
Posts: 3700
Joined: Mon Jan 26, 2009 3:19 pm UTC
Location: Sydney, Australia

Re: Mind Blowing Algorithms

resauce wrote:

Code: Select all

def fib(n):
    return fib(n-1) + fib(n-2) if n > 1 else n

It works, but the double recursion is a killer. Besides, recursion & Python don't mix so well. Sure, you can do recursion in Python, but it's best avoided when it's practical to do so. See danger of recursive functions and Recursive dogma for details.

Here's a more Pythonic iterative Fibonacci function:

Code: Select all

def fibo(n):
    a, b = 0, 1
    for i in xrange(n):
        a, b = b, a + b
    return a

And here's one that uses the (so-called) Binet formula. Because it uses floating-point numbers it's not accurate for large n, but it's pretty fast since it doesn't use Python loops.

Code: Select all

def fibo(n):
    root5 = 5 ** 0.5
    phi = 0.5 * (root5 + 1)
    return int((phi ** n - (1 - phi) ** n) / root5)

You could get even more speed out of it by making root5 a global constant. (And maybe make phi global, too).

Lopsidation
Posts: 183
Joined: Tue Oct 27, 2009 11:29 pm UTC

Re: Mind Blowing Algorithms

Casuals.

Code: Select all

def fib(n):
    if n < 3: return [0,1,1][n]
    if n % 2 == 0: return fib(n/2+1)**2 - fib(n/2-1)**2
    return fib(n/2)**2 + fib(n/2+1)**2

x = fib(1000000) # shots fired

PM 2Ring
Posts: 3700
Joined: Mon Jan 26, 2009 3:19 pm UTC
Location: Sydney, Australia

Re: Mind Blowing Algorithms

@Lopsidation

Yeah, ok. Maybe you should add memoization to that...

Yakk
Poster with most posts but no title.
Posts: 11115
Joined: Sat Jan 27, 2007 7:27 pm UTC
Location: E pur si muove

Re: Mind Blowing Algorithms

Why? It samples things so sparsely that it would be wasteful.
One of the painful things about our time is that those who feel certainty are stupid, and those with any imagination and understanding are filled with doubt and indecision - BR

Last edited by JHVH on Fri Oct 23, 4004 BCE 6:17 pm, edited 6 times in total.

Lopsidation
Posts: 183
Joined: Tue Oct 27, 2009 11:29 pm UTC

Re: Mind Blowing Algorithms

I tried it; memoization speeds it up a lot. Adding memoization brings fib(10**7) from 7.896 seconds down to 0.278 seconds.

Apparently fib(10**7) makes 13333333 calls to fib, but it only calls fib on 79 distinct inputs.
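For reference, here is Lopsidation's function with memoization bolted on (translated to Python 3, so `//` replaces Python 2's integer `/`; the `lru_cache` decorator is my addition, not the original code):

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def fib(n):
    # Fast doubling, with every result cached so the +/-1 neighbours
    # that keep reappearing at each halving level are free on reuse.
    if n < 3:
        return [0, 1, 1][n]
    if n % 2 == 0:
        return fib(n // 2 + 1) ** 2 - fib(n // 2 - 1) ** 2
    return fib(n // 2) ** 2 + fib(n // 2 + 1) ** 2
```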

Yakk
Poster with most posts but no title.
Posts: 11115
Joined: Sat Jan 27, 2007 7:27 pm UTC
Location: E pur si muove

Re: Mind Blowing Algorithms

Ah.

Suppose our input is 4k. Then first iteration we call:

if n%2==0: return fib(n/2+1)**2-fib(n/2-1)**2
return fib(n/2)**2+fib(n/2+1)**2

2k+1 and 2k-1

These are both odd. In the 2k+1 case we call:

k and k+1

in the 2k-1 case we call:

k-1 and k

with memoization, after 2 iterations we went from 6 down to 5 different values.

This gets better after more iterations, as we sample +/-1 away from half our value. These +/-1s get less and less important as we iteratively halve them. After n iterations, we have about 2^n samples, but they are spread over +/-1 +/- 1/2 +/- 1/4 +/- 1/8 = +/- 2

By the pigeon hole principle, there can be at most 5 distinct samples.

So with memoization we get ~5 lg n distinct values evaluated; without it, we get ~2^lg n = n.

13333333 is ~= 1e7 so that matches the actual test.

5 lg(1e7): since lg(1e1) ≈ 10/3, 5 lg(1e7) ≈ 5 · 7 · 10/3 ≈ 117.

The 79 is just a touch under that. And 5lgn was an upper bound (based on how many pigeons), so that is reasonably close.

Me thinking it was "sparse" was a screwup -- the sampling *converged* with more iterations due to the 1/2 multiplier.

If you pass in two distinct unrelated inputs, their distance collapses by 1/2 each time. So the number of steps it takes for their memoization to hit each other is roughly lg (difference) assuming they are within a factor of sqrt(2) of each other I think? Minus a term to account for the 5 wide "band" around the halving that the pigeons live in.

For numbers further apart, take the number of halvings for them to be within sqrt(2) of each other, plus the log of the difference there, before they intersect.

So for two random numbers, we get (lg larger number)-(lg smaller number) for the number of iterations for the big one to get "close" to the smaller number. Then we end up with two values that are within sqrt(2) of each other. Their difference will be about 1/4ish of their value. The log of this is going to be lg(smaller number)-2. Add this to the above and we get lg(larger number)-2.

And the number of iterations is about lg(larger number) minus some constant term. Ie, for two unrelated values, I suspect the cross-memoization won't help much. If that holds up under experimentation, you can set up the memoizier to discard its cached state (after an "external" call) rather than keeping it around.

Qaanol
The Cheshirest Catamount
Posts: 3069
Joined: Sat May 09, 2009 11:55 pm UTC

Re: Mind Blowing Algorithms

If you want the cached values to be generally-applicable, you can just store fib(2n) and fib(2n-1), then combine them per the bits of n. I don’t speak Python fluently, but here’s my (probably highly inefficient) attempt at code:

Edit: slightly refined code

Code: Select all

__fib = {0:[0,1]} # stores pairs [fib(2**n - 1), fib(2**n)]

def fibcache(n):
    if not n in __fib:
        prev = fibcache(n-1)
        __fib[n] = [prev[0]**2 + prev[1]**2, (prev[0]+prev[1])**2 - prev[0]**2]
    return __fib[n]

def fib(n):
    parity = n % 2
    result = fibcache(0)
    i = 0
    while n > 0:
        i = i + 1
        n //= 2
        if n % 2 > 0:
            temp = result # is this line necessary?
            cache = fibcache(i)
            result = [cache[0]*temp[0] + cache[1]*temp[1], \
                      cache[0]*temp[1] + cache[1]*(temp[0]+temp[1])]
    return result[parity]

Original version in spoiler:
Spoiler:

Code: Select all

__fib = {0:[0,1]} # stores pairs [fib(2**n - 1), fib(2**n)]

def fib(n):
    if n < 2: return __fib[0][n]
    parity = n % 2
    result = __fib[0]
    i = 0
    while n > 0:
        i = i + 1
        if not i in __fib:
            __fib[i] = [__fib[i-1][0]**2 + __fib[i-1][1]**2, \
                       -__fib[i-1][0]**2 + (__fib[i-1][0]+__fib[i-1][1])**2]
        n //= 2
        if n % 2 > 0:
            temp = result # is this line necessary?
            result = [__fib[i][0]*temp[0] + __fib[i][1]*temp[1], \
                      __fib[i][0]*temp[1] + __fib[i][1]*(temp[0]+temp[1])]
    return result[parity]

# obligatory "I wrote this at 2am" disclaimer

Lopsidation’s version is likely faster though, and definitely more elegant.
wee free kings

Derek
Posts: 2180
Joined: Wed Aug 18, 2010 4:15 am UTC

Re: Mind Blowing Algorithms

So based on Yakk's analysis of Lopsidation's algorithm with memoization, the runtime is O(log n) = O(N), with O(log n) = O(N) memory (where n is the input and N is the number of bits in the input).

Of course, the output is O(phi^n), or O(2^N) in bits. At that point you have to start considering the runtime of your arithmetic operations (O(N) for addition, O(N^2) or better for multiplication), otherwise you end up with silly conclusions like needing less memory to compute than the result takes, or less time to compute than it takes to write the result to memory.

So going back, our recurrence relation is T(N) = O(2*T(N-1)) + 2*f(O(2^N)) + O(2^N), where f(x) is the runtime of our multiplication algorithm. This recurrence is going to be dominated by the last two multiplications, so the total runtime should be O(f(2^N)), which is O(2^(2N)) for standard multiplication, and O(2^N * log 2^N * log log 2^N) = O(2^N * N * log N) for the fastest multiplication algorithms.

Yakk
Poster with most posts but no title.
Posts: 11115
Joined: Sat Jan 27, 2007 7:27 pm UTC
Location: E pur si muove

Re: Mind Blowing Algorithms

Technically the O( n log(n) 2^Θ(log*(n)) ) multiplication algorithm is faster than that, but nobody uses that algorithm for good reason ("impractically large inputs" required for it to beat a basic discrete FFT solution).

mikael
Posts: 28
Joined: Mon Feb 16, 2015 6:56 pm UTC
Location: Avignon, France
Contact:

Re: Mind Blowing Algorithms

Yakk wrote:Technically the O( n log(n) 2^Θ(log*(n)) ) multiplication algorithm is faster than that, but nobody uses that algorithm for good reason ("impractically large inputs" required for it to beat a basic discrete FFT solution).

To come back to the topic: the FFT itself is one algorithm that blew my mind. And then, using it for multiplication...

Also, the O(n) median-finding algorithm, when I understood why it actually ran in O(n).

And then, when my mind was smaller and more easily blown, Bresenham's line algorithm.
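Bresenham's algorithm still fits in a few lines of pure integer arithmetic (a textbook-style sketch of the all-octant error-accumulation form, not any particular implementation):

```python
def bresenham(x0, y0, x1, y1):
    # Integer-only line rasterization: accumulate a signed error term and
    # step x and/or y whenever the error crosses the midpoint threshold.
    dx, dy = abs(x1 - x0), -abs(y1 - y0)
    sx = 1 if x0 < x1 else -1
    sy = 1 if y0 < y1 else -1
    err = dx + dy
    points = []
    while True:
        points.append((x0, y0))
        if x0 == x1 and y0 == y1:
            break
        e2 = 2 * err
        if e2 >= dy:
            err += dy
            x0 += sx
        if e2 <= dx:
            err += dx
            y0 += sy
    return points
```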

Yakk
Poster with most posts but no title.
Posts: 11115
Joined: Sat Jan 27, 2007 7:27 pm UTC
Location: E pur si muove

Re: Mind Blowing Algorithms

mikael wrote:
Yakk wrote:Technically the O( n log(n) 2^Θ(log*(n)) ) multiplication algorithm is faster than that, but nobody uses that algorithm for good reason ("impractically large inputs" required for it to beat a basic discrete FFT solution).

To come back to the topic: the FFT itself is one algorithm that blew my mind.

Heck, the Fourier Transform, let alone the FFT.
And then, using it for multiplication...

It is less crazy if you think about using it for convolution rather than multiplication.

Multiplication in base-N is multiplication of polynomials (with some futzing at the end), and polynomial multiplication is very convolution like.

And A convolve B = inverse_FT( FT( A ) pointwise_multiply FT( B ) )

Also, the O(n) median-finding algorithm, when I understood why it actually ran in O(n).

This one? http://stackoverflow.com/a/251884/1774667

I like the linked presentation of the algorithm.

To find the pseudo-median: break your list into groups of 5. Take the median of each group (constant time per group), then the median of those medians (via this algorithm recursively, on a much smaller set of elements). Guarantee: the result is reasonably close to the true median.

Partition by the pseudo-median. Then recurse into the lower or higher partition (or onto the pivot, if lucky) as indicated by what element you want.

Then the cost is f(n) = (constant)*(n/5) (medians of 5s) + f(.2n) (median of medians) + f(.7n) (recursive step), which is linear.

The hard part to prove is the guarantee. Each median is going to be greater-equal than 3 of the elements of its 5 (including itself), and the median of those is going to be greater-equal than half of the 5s medians. So you are going to be at least 30% of the way through the entire list. And ditto on the other side. So your pseudo-median is going to be in the (30% - 70%) range of the original collection.
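That recipe translates almost line-for-line into code (a sketch; `select` here returns the k-th smallest element, 0-based):

```python
def select(lst, k):
    # Median-of-medians quickselect: worst-case O(n) selection.
    if len(lst) <= 5:
        return sorted(lst)[k]
    # Medians of groups of 5 (constant time per group)...
    groups = [sorted(lst[i:i+5]) for i in range(0, len(lst), 5)]
    medians = [g[len(g) // 2] for g in groups]
    # ...then the median of medians, recursively (the f(.2n) call).
    pivot = select(medians, len(medians) // 2)
    lo = [x for x in lst if x < pivot]
    hi = [x for x in lst if x > pivot]
    eq = len(lst) - len(lo) - len(hi)
    # Recurse into one side only (the f(.7n) call), or hit the pivot.
    if k < len(lo):
        return select(lo, k)
    if k < len(lo) + eq:
        return pivot
    return select(hi, k - len(lo) - eq)
```

The two recursive calls are exactly the f(.2n) and f(.7n) terms in the cost recurrence above.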

mikael
Posts: 28
Joined: Mon Feb 16, 2015 6:56 pm UTC
Location: Avignon, France
Contact:

Re: Mind Blowing Algorithms

Yakk wrote:Heck, the Fourier Transform, let alone the FFT.

I don't believe the Fourier Transform qualifies as an algorithm, but it's beautiful alright. I rely on it whenever I need to think about signals, and that's pretty often. This whole idea of switching between multiplication and convolution... it just makes signals so simple to understand.

For example, I recently discovered voice inversion: http://www.windytan.com/2013/05/descrambling-voice-inversion.html

And I was like, what? Multiply every other sample by -1? How does that invert the spectrum? So I took a short trip to the land of convolution with pulses and combs, and it all became clear.

... polynomial multiplication is very convolution like.

Yes, that: I'm not even sure I understand it. You see, I'm familiar with convolutions in the context of signals, but to see it applied to polynomials is still alien to me. I don't work with large numbers or polynomials nearly as often as I work with signals. And yet, they share a lot of algebraic structure. I know I'm lacking in the algebra department, but I'm working on it.

Also, the O(n) median-finding algorithm, when I understood why it actually ran in O(n).

This one? http://stackoverflow.com/a/251884/1774667

Yes, that's the one. I remember that the general framework was called "prune-and-search", and that was quite a bit more challenging for me than divide-and-conquer.

Yakk wrote:I like the linked presentation of the algorithm.

That diagram with the groups of five elements and the medians is what I remember, yes.
And thank you for the proof of cost, but I'm pretty sure it won't just come back to me like that on a Friday night.

Yakk
Poster with most posts but no title.
Posts: 11115
Joined: Sat Jan 27, 2007 7:27 pm UTC
Location: E pur si muove

Re: Mind Blowing Algorithms

Spoiler:
mikael wrote:
Yakk wrote:Heck, the Fourier Transform, let alone the FFT.

I don't believe the Fourier Transform qualifies as an algorithm, but it's beautiful alright.

Sort of. Convolve your signal with a family of sinusoidals. That can be done "naively" and you have a Fourier Transform.

(sure, infinite precision and points).
I rely on it whenever I need to think about signals, and that's pretty often. This whole idea of switching between multiplication and convolution... it just makes signals so simple to understand.

Now, the fun part is that the family of sinusoidals is not the super special bit of the fourier transform.

Breaking a signal down via convolution can be done with many families of functions (if not as pretty as sinusoidals).

We might "hear" sinusoidal like sound because of how our ears are designed (as a tube that detects resonant frequencies). But I don't think the physics of air makes them particularly "super special the real deal"?

... polynomial multiplication is very convolution like.

Yes, that: I'm not even sure I understand it. You see, I'm familiar with convolutions in the context of signals, but to see it applied to polynomials is still alien to me. I don't work with large numbers or polynomials nearly as often as I work with signals. And yet, they share a lot of algebraic structure. I know I'm lacking in the algebra department, but I'm working on it.

You take two sets of values A = {...,a_i,a_i+1,...} and B = {...,b_i,b_i+1,...}, put a finger down on each of them, and multiply the two values. Then you slide your fingers in opposite directions and multiply again, adding up the products as you go. That is the basic "convolution" operation. Where your fingers "started" (well, the sum of the coordinates) is where the sum goes.

To find out the value somewhere else, move your fingers to a different "starting sum".

For continuous things, this ends up with integration. For discrete things, it is a sum.

This rather obviously lines up with a polynomial, where the a_i and b_i are the coefficients of the x^i terms. If C = A*B, then c_i = a_i*b_0 + a_(i-1)*b_1 + ... + a_0*b_i,
because a_i x^i * b_0 x^0 is one of the contributors to the x^i term. And same for a_(i-1) x^(i-1) * b_1 x^1, etc.
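You can check that coefficient picture directly with a naive O(n^2) convolution; the carrying at the end is the "futzing" mentioned earlier:

```python
def convolve(a, b):
    # Naive discrete convolution: c[k] = sum over i+j == k of a[i]*b[j].
    c = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            c[i + j] += x * y
    return c

# Base-10 digits, least significant first: 123 -> [3, 2, 1].
# Convolving the digit lists multiplies the numbers, up to carrying.
digits = convolve([3, 2, 1], [6, 5, 4])   # 123 * 456
value = sum(d * 10 ** k for k, d in enumerate(digits))
```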

Once you get that (base 10 digit representations of a number are just a polynomial where x=10) and (polynomial multiplication is a form of convolution), using Fourier (which makes convolution faster by turning it from a seeming O(n^2) problem into an O(n log n) one) to make multiplication fast ... is less surprising.

I still don't quite get the finite roots of unity FFT, nor how FFT gives us a fourier transform, but baby steps.

mikael
Posts: 28
Joined: Mon Feb 16, 2015 6:56 pm UTC
Location: Avignon, France
Contact:

Re: Mind Blowing Algorithms

Yakk wrote:We might "hear" sinusoidal like sound because of how our ears are designed (as a tube that detects resonant frequencies). But I don't think the physics of air makes them particularly "super special the real deal"?

I think they are "super special", but in a purely non-physical sense: they are the functions that completely "forget" about time. And that's why the FT transforms a signal based on time into a signal that "knows" nothing about time and is instead based on frequencies.

That's the main reason I don't like the "windowed FFT", which arbitrarily mixes the time and frequency domains. I know it's at the heart of most DSP applications, but I tend to view it with distrust, compared to the beautiful "pure" FT.

... you put your finger down on both of them, then you multiply. Then you slide your finger in the opposite direction, and multiply again. That is the basic "convolution" operation. Where your fingers "started" (well, the sum of the coordinates) is where the sum goes.

You literally just described the way I multiply vectors and matrices. So OK, I need to study the convolution operation in a more abstract setting, not just in the context of audio signals.

And I think I just got the "polynomial multiplication is a form of convolution" thing, thanks to your explanation. I really need to get back to my algebra textbook and reread the chapter on rings.

I still don't quite get the finite roots of unity FFT, nor how FFT gives us a fourier transform

Here's how I understand it: first, you start with the definition of the general FT. Then you restrict your domain to be both discrete and finite, which, by the properties of the FT also makes your codomain discrete and finite. This defines the Discrete Fourier Transform, or DFT.

Now the way you compute a DFT is really simple: it's just the multiplication of a vector by a matrix, giving a new vector. The argument and result of the DFT can be represented as vectors precisely because your domains are discrete and finite.

So what's inside this matrix? Simply roots of unity. I usually write the matrix down using small circles with a tick that represent e^ix: for x=0, the tick is on the right, for x=pi/2, it's on top, etc.

Just like that: http://en.wikibooks.org/wiki/Digital_Signal_Processing/Discrete_Fourier_Transform#Visualization_of_the_Discrete_Fourier_Transform

And so, the DFT seems to require n² operations. Except that, because of the structure of this matrix, it can be cleverly split in two for a divide-and-conquer attack, and that's the FFT algorithm.
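That clever split is short enough to sketch (a naive radix-2 Cooley-Tukey FFT; it assumes the length is a power of two, and is nothing like a production implementation):

```python
import cmath

def fft(a):
    # Recursive radix-2 Cooley-Tukey: split into even- and odd-indexed
    # halves, transform each, then combine with the roots of unity.
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2])
    odd = fft(a[1::2])
    out = [0] * n
    for k in range(n // 2):
        w = cmath.exp(-2j * cmath.pi * k / n)  # n-th root of unity, power k
        out[k] = even[k] + w * odd[k]
        out[k + n // 2] = even[k] - w * odd[k]
    return out
```

The two half-size calls plus O(n) combining give the T(n) = 2T(n/2) + O(n) = O(n log n) recurrence, versus n² for the plain matrix product.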

Yakk
Poster with most posts but no title.
Posts: 11115
Joined: Sat Jan 27, 2007 7:27 pm UTC
Location: E pur si muove

Re: Mind Blowing Algorithms

I mean the number theoretic roots of unity, where you do the math over some finite field instead of C.

mikael
Posts: 28
Joined: Mon Feb 16, 2015 6:56 pm UTC
Location: Avignon, France
Contact:

Re: Mind Blowing Algorithms

Yakk wrote:I mean the number theoretic roots of unity, where you do the math over some finite field instead of C.

Yeah, see? That's the problem with always trying to visualize concepts like I do: you get stuck at a fairly low level of abstraction. Number theory still flies way over my head.

And so do group theory and category theory, which are my two main interests at the moment. But that's for another forum, I guess.

gcgcgcgc
Posts: 47
Joined: Sun Nov 10, 2013 1:18 pm UTC

Re: Mind Blowing Algorithms

Bzip and the Burrows-Wheeler Transform, in particular the decode (along with the whole idea of a reversible sort!)
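A minimal sketch of both directions (the naive rotation-sorting version with an explicit '$' end marker, which must not occur in the input; bzip2's real implementation is far more efficient):

```python
def bwt(s):
    # Burrows-Wheeler transform: sort all rotations of s + '$',
    # keep the last column.
    s = s + "$"
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(r[-1] for r in rotations)

def inverse_bwt(last):
    # Rebuild the sorted rotation table column by column: repeatedly
    # prepend the last column and re-sort. The row ending in '$' is
    # the original string.
    table = [""] * len(last)
    for _ in range(len(last)):
        table = sorted(c + row for c, row in zip(last, table))
    return next(row for row in table if row.endswith("$"))[:-1]
```

The mind-blowing part is the decode: the last column alone carries enough information to regrow the whole sorted table.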

PM 2Ring
Posts: 3700
Joined: Mon Jan 26, 2009 3:19 pm UTC
Location: Sydney, Australia

Re: Mind Blowing Algorithms

gcgcgcgc wrote:Bzip and the Burrows-Wheeler Transform, in particular the decode (along with the whole idea of a reversible sort!)

Mind blown.

The BW matrix operations vaguely remind me of an old card trick. I can't remember the details of the trick, but I guess I could reconstruct it from the BWT algorithm...

elasto
Posts: 3750
Joined: Mon May 10, 2010 1:53 am UTC

Re: Mind Blowing Algorithms

Seconding mind being blown.

Who'd have thought you could have a reversible sort...

Story
Posts: 78
Joined: Wed Aug 26, 2009 9:03 pm UTC

Re: Mind Blowing Algorithms

troyp wrote:I'm sorry, I have to ask: can you think of a *more* obvious and straightforward way to count the number of bits in a int than the one you quoted? Because personally, I would find *that* amazing.

In Python, the idiomatic way to count bits is

Code: Select all

bin(x).count('1')
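Another classic, if you want to avoid the string conversion, is Kernighan's trick (a sketch; the function name is made up):

```python
def popcount(x):
    # x &= x - 1 clears the lowest set bit, so the loop body
    # runs exactly once per 1-bit in x.
    count = 0
    while x:
        x &= x - 1
        count += 1
    return count
```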

walkerm930
Posts: 69
Joined: Wed Apr 07, 2010 3:53 am UTC

Re: Mind Blowing Algorithms

I'd always thought that palindrome finding had to be quadratic time, but you can do it in linear time using Manacher's algorithm: http://stackoverflow.com/a/10468753/2256231

The symmetry stuff makes some sense after thinking about it, but making it linear time, that's crazy.
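For reference, a compact sketch of Manacher's algorithm (using the usual '#' separator so every palindrome gets an odd-length center; the function name is invented):

```python
def longest_palindrome(s):
    # Manacher's algorithm, O(n): radius[i] is the palindrome radius at
    # center i of the separator-padded string. Inside the rightmost known
    # palindrome we reuse the mirror position's radius before expanding.
    t = "#" + "#".join(s) + "#"
    radius = [0] * len(t)
    center = right = 0
    for i in range(len(t)):
        if i < right:
            radius[i] = min(right - i, radius[2 * center - i])
        while (i - radius[i] - 1 >= 0 and i + radius[i] + 1 < len(t)
               and t[i - radius[i] - 1] == t[i + radius[i] + 1]):
            radius[i] += 1
        if i + radius[i] > right:
            center, right = i, i + radius[i]
    best = max(range(len(t)), key=lambda i: radius[i])
    start = (best - radius[best]) // 2
    return s[start:start + radius[best]]
```

The linear bound comes from `right` only ever moving forward: the total expansion work across all centers is O(n).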
In the gospel according to trig there are 3 primary rules: sin θ = x/h , cos θ = y/h and tan θ = x/y. These rules are not open to interpretation and are to be treated as law.

quintopia
Posts: 2906
Joined: Fri Nov 17, 2006 2:53 am UTC
Location: atlanta, ga

Re: Mind Blowing Algorithms

jestingrabbit wrote:You can use this sort of trick to make a doubly linked list with just a single pointer at each node: make that pointer the XOR of the next and prev, and keep two pointers to tell you where you are, a here and a next. Again, it's doable, but it's very much in the "too clever" basket.

Too clever or not, there are situations where this is exactly what you should do. For instance, if you need to work with a doubly-linked list in an embedded system. In particular, anywhere the following conditions hold, you should consider doing it:

1) Memory availability is tight or memory usage is very high. (So it becomes worthwhile to add a constant amount of extra computation, and slightly decrease code readability, in exchange for a linear decrease in memory usage.)
2) Memory management is manual. (In other words, you're using C. Doing this in a garbage-collected language would be pretty stupid, since you'd need to save a reference to every cell anyway.)
3) The usual conditions on when you'd use a linked list at all.
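Python can't XOR raw pointers, so here's the trick simulated with array indices standing in for addresses (in C you'd XOR `uintptr_t` values; everything here, including the class name, is a sketch):

```python
class XorList:
    # XOR linked list: each node stores value and link = prev ^ next,
    # where "pointers" are indices into self.nodes and 0 is null.
    def __init__(self):
        self.nodes = [(None, 0)]  # slot 0 is the null sentinel
        self.head = self.tail = 0

    def append(self, value):
        idx = len(self.nodes)
        self.nodes.append((value, self.tail))  # link = old_tail ^ 0
        if self.tail:
            v, link = self.nodes[self.tail]
            self.nodes[self.tail] = (v, link ^ idx)  # old next was 0
        else:
            self.head = idx
        self.tail = idx

    def traverse(self):
        # Walking needs a (prev, cur) pair: next = link ^ prev.
        prev, cur, out = 0, self.head, []
        while cur:
            value, link = self.nodes[cur]
            out.append(value)
            prev, cur = cur, link ^ prev
        return out
```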

Yakk
Poster with most posts but no title.
Posts: 11115
Joined: Sat Jan 27, 2007 7:27 pm UTC
Location: E pur si muove

Re: Mind Blowing Algorithms

quintopia wrote:3) The usual conditions on when you'd use a linked list at all.

So, almost never?

quintopia
Posts: 2906
Joined: Fri Nov 17, 2006 2:53 am UTC
Location: atlanta, ga

Re: Mind Blowing Algorithms

True, there's not too many situations in which one uses a deque out there in the "real world".

Jplus
Posts: 1721
Joined: Wed Apr 21, 2010 12:29 pm UTC
Location: Netherlands

Re: Mind Blowing Algorithms

You don't need a linked list for a deque.
"There are only two hard problems in computer science: cache coherence, naming things, and off-by-one errors." (Phil Karlton and Leon Bambrick)

coding and xkcd combined

(Julian/Julian's)

quintopia
Posts: 2906
Joined: Fri Nov 17, 2006 2:53 am UTC
Location: atlanta, ga

Re: Mind Blowing Algorithms

A deque is a natural use case for a DLL.

Yakk
Poster with most posts but no title.
Posts: 11115
Joined: Sat Jan 27, 2007 7:27 pm UTC
Location: E pur si muove

Re: Mind Blowing Algorithms

Linked-node data structures are almost always horrible ideas. Every step when traversing them is a nearly guaranteed cache miss.

For objects whose identity isn't tied to their location in memory, use a circular buffer that grows when it runs out of room.

Or, in short, the default container you should use is a std::vector. If your problem isn't suitable for using a std::vector, change the problem.

quintopia
Posts: 2906
Joined: Fri Nov 17, 2006 2:53 am UTC
Location: atlanta, ga

Re: Mind Blowing Algorithms

But again, we were talking about very small systems with very limited memory. You're not going to have huge lists that always miss the cache. Many embedded systems use fixed-size block dynamic memory allocation using linked lists.

EvanED
Posts: 4331
Joined: Mon Aug 07, 2006 6:28 am UTC
Contact:

Re: Mind Blowing Algorithms

Yakk wrote:Linked-node data structures are almost always horrible ideas. Every step when traversing them is a nearly guaranteed cache miss. ... Or, in short, the default container you should use is a std::vector. If your problem isn't suitable for using a std::vector, change the problem.
This is something that I struggle with, or would if I felt like I had an opportunity to improve what we have. I just don't know how. What we need to do for dataflow analysis is basically propagate a map around a graph. Each time we visit a node, we make a change to a very small number of map entries. We need to keep around the map before the change as well though, both so that we can retrieve the information from it later (we need answers for each node, not just some final sink node or whatever) as well as detect when the propagation quiesces.

What we use now is an applicative tree -- you can make a change to one node in the tree and all you have to do is copy everything on the path from the root of the tree to that one node (perhaps one node to the side because of rotations); most of the changed tree can share structure with the tree before the change. (Just as if in, say, lisp, you can say something like (define (replace-head-with lst newhead) (cons newhead (cdr lst))) and (replace-head-with '(2 3 4 5 ...) 1) will give you (1 3 4 5 ...) without needing to copy the (3 4 5 ...) tail. If you don't have an API for changing the tail, you can never tell if it was copied or not.) But the tree means it's a linked structure. I don't know how to make an applicative data structure for this that isn't linked. The only thing I can think of that might rival what we have is to use a sorted vector, but you're talking about needing to copy the entire structure every propagation step.
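The path-copying idea itself is easy to show in miniature (a toy persistent BST without the rebalancing a real applicative tree needs; all names are invented for illustration):

```python
# Nodes are tuples (key, value, left, right); None is the empty tree.
# Insert copies only the path from the root down to the changed node;
# every untouched subtree is shared, by reference, with the old version.

def insert(node, key, value):
    if node is None:
        return (key, value, None, None)
    k, v, left, right = node
    if key < k:
        return (k, v, insert(left, key, value), right)   # right is shared
    if key > k:
        return (k, v, left, insert(right, key, value))   # left is shared
    return (key, value, left, right)  # replace value, share both children

def lookup(node, key):
    while node is not None:
        k, v, left, right = node
        if key == k:
            return v
        node = left if key < k else right
    return None
```

Both the old and new roots stay valid after every insert, which is exactly what lets you keep the per-graph-node maps around cheaply.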

I actually think it'd be cool to try the sorted vector thing and see if it helps at all, but it has a big problem of being legitimately algorithmically worse. (And not in the same sense as the Stroustrup list/vector article you posted a while back where the apparent rise in complexity turned out to be just a step in the process and not raise the complexity of the overall process; here it would.) I don't know if the maps get big enough that it would matter. I probably don't have time to find out either; it would probably be a few hours to even get to the point of a solution that is horribly unfair to the vector solution (have many copies per propagation step) and probably infeasible to get a good implementation even if I had a month.