1381: "Margin"

This forum is for the individual discussion thread that goes with each new comic.

Moderators: Moderators General, Prelates, Magistrates

jpvlsmv
Posts: 85
Joined: Wed Jan 09, 2013 9:43 pm UTC

Re: 1381: "Margin"

Postby jpvlsmv » Mon Jun 16, 2014 3:57 pm UTC

Brian-M wrote:Unfortunately, this process has two flaws...

Firstly, you'll end up with lots of false positives, so not only will you end up with the information you started with, you'll also end up with lots of information that never existed before, and you may not be able to tell which information is the information that was originally encoded.


Yes. Among the "lots" of information that you can extract from your algorithm (that never existed before) is a nearly-identical copy of your original information which has a subtle flaw or bug that gives the exactly wrong answer.

There are an infinite number of programs whose source code has the same MD5 (and SHA-1 and whatever additional hash you want to use) as the GNU Hello World example, but that prints "Goodbye World" instead. But on the bright side, if you include the original file size in your compressed data, almost all of the files that match the hashes won't compile. But one might.

Or for this comic example, an infinite number of proofs that can be extracted from the margin have a subtle flaw that makes them invalid. I have a fascinating proof for this, but this post is too small to... oh, wait. Here, I've compressed it for you: -=> . <=-

--Joe

Paragon99
Posts: 16
Joined: Thu Sep 20, 2012 4:10 am UTC

Re: 1381: "Margin"

Postby Paragon99 » Tue Jun 17, 2014 2:54 am UTC

You could encode any amount of information with only two integer (n and l).
n is the nth digit of pi from which to start.
l is the number of digits to take.

Gee... I wonder how many digits those integers would have...

User avatar
eviloatmeal
Posts: 570
Joined: Thu Dec 11, 2008 9:39 am UTC
Location: Upside down in space!
Contact:

Re: 1381: "Margin"

Postby eviloatmeal » Tue Jun 17, 2014 6:55 am UTC

Paragon99 wrote:You could encode any amount of information with only two integer (n and l).
n is the nth digit of pi from which to start.
l is the number of digits to take.

Gee... I wonder how many digits those integers would have...

Philip Langdale already implemented this: Pi filesystem.
*** FREE SHIPPING ENABLED ***
Image
Riddles are abound tonightImage

User avatar
orthogon
Posts: 3057
Joined: Thu May 17, 2012 7:52 am UTC
Location: The Airy 1830 ellipsoid

Re: 1381: "Margin"

Postby orthogon » Tue Jun 17, 2014 7:58 am UTC

eviloatmeal wrote:
Paragon99 wrote:You could encode any amount of information with only two integer (n and l).
n is the nth digit of pi from which to start.
l is the number of digits to take.

Gee... I wonder how many digits those integers would have...

Philip Langdale already implemented this: Pi filesystem.

:D Triumphantly absurd. Using pi as a compression system is one thing (though looking at some of the discussion threads it looks like the "compressed" version is almost always bigger than the original), but using it for a filesystem brings the additional problem of random access: every time you make a change to one byte in any file, you have to repeat the search to find the entire FS contents in pi. Splendid.
xtifr wrote:... and orthogon merely sounds undecided.

User avatar
eviloatmeal
Posts: 570
Joined: Thu Dec 11, 2008 9:39 am UTC
Location: Upside down in space!
Contact:

Re: 1381: "Margin"

Postby eviloatmeal » Tue Jun 17, 2014 9:02 am UTC

orthogon wrote: :D every time you make a change to one byte in any file, you have to repeat the search to find the entire FS contents in pi. Splendid.

I think it stores files individually, so you would just have to find the "new" file in pi, not all of your files bunched together.

Perhaps defragging would make your computer slower! :lol:
*** FREE SHIPPING ENABLED ***
Image
Riddles are abound tonightImage

xtifr
Posts: 361
Joined: Wed Oct 01, 2008 6:38 pm UTC

Re: 1381: "Margin"

Postby xtifr » Thu Jun 19, 2014 11:23 pm UTC

eviloatmeal wrote:
orthogon wrote: :D every time you make a change to one byte in any file, you have to repeat the search to find the entire FS contents in pi. Splendid.

I think it stores files individually, so you would just have to find the "new" file in pi, not all of your files bunched together.

Perhaps defragging would make your computer slower! :lol:

It can't be fragmented, because it uses no storage. There's no way for the data to get out-of-order, since it's stored in pi, not on your HD. Defragging not only can't affect it, but is a meaningless concept with pifs.

Now the metadata storage is another matter. But the actual data? Frag-proof.
"[T]he author has followed the usual practice of contemporary books on graph theory, namely to use words that are similar but not identical to the terms used in other books on graph theory."
-- Donald Knuth, The Art of Computer Programming, Vol I, 3rd ed.

User avatar
eviloatmeal
Posts: 570
Joined: Thu Dec 11, 2008 9:39 am UTC
Location: Upside down in space!
Contact:

Re: 1381: "Margin"

Postby eviloatmeal » Fri Jun 20, 2014 8:05 am UTC

xtifr wrote:
eviloatmeal wrote:I think it stores files individually, so you would just have to find the "new" file in pi, not all of your files bunched together.

Perhaps defragging would make your computer slower! :lol:

It can't be fragmented, because it uses no storage. There's no way for the data to get out-of-order, since it's stored in pi, not on your HD. Defragging not only can't affect it, but is a meaningless concept with pifs.

Now the metadata storage is another matter. But the actual data? Frag-proof.

Well, yes, you wouldn't encounter fragmentation in the normal sense - two pieces of data can occupy the same digits of pi, so you would never need to break a file up into fragments. I mean in the sense of how much data you should store sequentially. Storing a large file, such as a film, sequentially would take a very long time. My assumption was that it would be faster to break such a large file into fragments and storing those individually. Perhaps that is wrong.

Also, for files that need to grow, if you're only adding data, I'm guessing it would be a lot faster to store the new stuff as a fragment, rather than re-storing the entire file in a different part of pi. Heck, even just storing the changes as fragments, and leaving the unchanged parts as "fragments of which we already know the location" might still be faster, depending on the file size.

Eh, what do I know, I'm no mathmatistician.
*** FREE SHIPPING ENABLED ***
Image
Riddles are abound tonightImage

xtifr
Posts: 361
Joined: Wed Oct 01, 2008 6:38 pm UTC

Re: 1381: "Margin"

Postby xtifr » Sun Jun 22, 2014 9:22 am UTC

eviloatmeal wrote:Well, yes, you wouldn't encounter fragmentation in the normal sense - two pieces of data can occupy the same digits of pi, so you would never need to break a file up into fragments. I mean in the sense of how much data you should store sequentially. Storing a large file, such as a film, sequentially would take a very long time.

Actually, it takes no time at all to store something in pi. The data is already there, and pi is essentially read-only, being an abstraction and non-physical, and a constant, and all. The only thing that takes time is finding where your data is already stored in pi. Which perhaps could be called the seek time. :)

My assumption was that it would be faster to break such a large file into fragments and storing those individually. Perhaps that is wrong.

As far as I can tell, that's absolutely correct, and is exactly what pifs does. Probably because otherwise, "writing" (seeking) a large file on pifs is an operation that could take years, or even centuries.

Also, for files that need to grow, if you're only adding data, I'm guessing it would be a lot faster to store the new stuff as a fragment, rather than re-storing the entire file in a different part of pi.

That might be true, but as far as I can tell (and I haven't actually read the code), pifs does not do that. If you append data to a file, it will "seek" the location of the modified final sector (since the files are broken up into sectors), until the sector is filled, at which point it will start a new sector.
"[T]he author has followed the usual practice of contemporary books on graph theory, namely to use words that are similar but not identical to the terms used in other books on graph theory."
-- Donald Knuth, The Art of Computer Programming, Vol I, 3rd ed.

User avatar
eviloatmeal
Posts: 570
Joined: Thu Dec 11, 2008 9:39 am UTC
Location: Upside down in space!
Contact:

Re: 1381: "Margin"

Postby eviloatmeal » Tue Jun 24, 2014 7:14 am UTC

xtifr wrote:Actually, it takes no time at all to store something in pi.

I suppose you have a point in that the word "store" is imprecise for the actual thing that is happening, which consists (mostly) of seeking, and doesn't involve a lot of writing (at least not in terms of changing the laws of nature to make the numbers in pi match your piece of data). But I was thinking of it in terms of the entire storage operation, which - I posit - must necessarily contain some kind of seek and some kind of write. If you were just writing stuff to your hard drive disk without worrying about where, then I think you would be writing a lot of love notes to your boot sector, and that wouldn't be particularly useful.

xtifr wrote:The data is already there, and pi is essentially read-only, being an abstraction and non-physical, and a constant, and all. The only thing that takes time is finding where your data is already stored in pi. Which perhaps could be called the seek time. :)

Like an immutable object - all you do is change the reference, not the value.

xtifr wrote:If you append data to a file, it will "seek" the location of the modified final sector (since the files are broken up into sectors), until the sector is filled, at which point it will start a new sector.

Close enough. My key point was that it would be silly to go looking for the entire new sequence when just a piece of it was changed, and the rest of it still matches its location in pi. Although according to the pifs readme, it stores every byte individually, which sounds inefficient in terms of saving space.

Also
Spoiler:
Image
Simon Plouffe
Damn this guy is handsome!
*** FREE SHIPPING ENABLED ***
Image
Riddles are abound tonightImage


Return to “Individual XKCD Comic Threads”

Who is online

Users browsing this forum: No registered users and 97 guests