best way to store string of 0,s 1,s.

ftfs
Posts: 34
Joined: Tue Mar 01, 2011 11:46 pm UTC

best way to store string of 0,s 1,s.

Postby ftfs » Thu Jun 09, 2011 11:15 pm UTC

Hi,

I want to store a sequence of zeros and ones in a file (I'm using Fedora)
with as little overhead as possible.

Basically, I have a string of 1679 zeros and ones.

I don't want to worry about rows, columns, line ends, or anything like that.

I just want to save the info in a file, like so:

00000010101010000000000001010000010100000001001000100010001001011001010101010101010100100100000000000000000000000000000000000001100000000000000000001101000000000000000000011010000000000000000001010100000000000000000011111000000000000000000000000000000001100001110001100001100010000000000000110010000110100011000110000110101111101111101111101111100000000000000000000000000100000000000000000100000000000000000000000000001000000000000000001111110000000000000111110000000000000000000000011000011000011100011000100000001000000000100001101000011000111001101011111011111011111011111000000000000000000000000001000000110000000001000000000001100000000000000010000011000000000011111100000110000001111100000000001100000000000001000000001000000001000001000000110000000100000001100001100000010000000000110001000011000000000000000110011000000000000011000100001100000000011000011000000100000001000000100000000100000100000001100000000100010000000011000000001000100000000010000000100000100000001000000010000000100000000000011000000000110000000011000000000100011101011000000000001000000010000000000000010000011111000000000000100001011101001011011000000100111001001111111011100001110000011011100000000010100000111011001000000101000001111110010000001010000011000000100000110110000000000000000000000000000000000011100000100000000000000111010100010101010101001110000000001010101000000000000000010100000000000000111110000000000000000111111111000000000000111000000011100000000011000000000001100000001101000000000101100000110011000000011001100001000101000001010001000010001001000100100010000000010001010001000000000000100001000010000000000001000000000100000000000000100101000000000001111001111101001111000

That works, but it's ASCII, so each 0 or 1 is stored as an 8-bit character code rather than as a single bit. Oh dear.
So a text file is no good.
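
To make that overhead concrete, here is a small Python 3 sketch (not from the original post) that prints the bit patterns ASCII stores for the characters '0' and '1', and the resulting size of the text file in bits. The variable names are only illustrative.

```python
# In an ASCII text file every character occupies one byte (8 bits).
ascii_zero = format(ord("0"), "08b")  # bit pattern stored for the character '0'
ascii_one = format(ord("1"), "08b")   # bit pattern stored for the character '1'

n_digits = 1679                 # length of the sequence in the post
text_bits = n_digits * 8        # bits actually written when saved as text

print(ascii_zero, ascii_one)    # 00110000 00110001
print(n_digits, text_bits)      # 1679 13432
```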

Any ideas?

Basically, I want to encrypt this info without any file headers and so forth, so that when it's decrypted the string is recoverable in its present form (not as a text file, but simply as 1679 binary zeros and ones). I guess I just want the sequence saved as 0s and 1s on the hard disk platter, just like it is now.

Any and all ideas welcome.

Cheers

undecim
Posts: 289
Joined: Tue Jan 19, 2010 7:09 pm UTC

Re: best way to store string of 0,s 1,s.

Postby undecim » Thu Jun 09, 2011 11:26 pm UTC

Since the number of bits you have (1679) is not divisible by 8, you will need to store that number in the file, unless it will always be the same whenever you use this format. (File length is measured in bytes, i.e. 8 bits at a time.)
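
As a quick check of that arithmetic, a Python 3 sketch (variable names are illustrative, not from the thread) that works out how many bytes 1679 bits need and how many bits of the last byte go unused:

```python
n_bits = 1679
full_bytes, leftover_bits = divmod(n_bits, 8)            # whole bytes, bits left over
total_bytes = full_bytes + (1 if leftover_bits else 0)   # round up to a whole byte
padding_bits = total_bytes * 8 - n_bits                  # unused bits in the last byte

print(full_bytes, leftover_bits, total_bytes, padding_bits)  # 209 7 210 1
```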

Where is this number coming from? Writing raw binary to a file is trivial in most programming languages, so if it's a script you're writing, there's probably a way to incorporate this into it.
Blue, blue, blue

ftfs
Posts: 34
Joined: Tue Mar 01, 2011 11:46 pm UTC

Re: best way to store string of 0,s 1,s.

Postby ftfs » Fri Jun 10, 2011 12:03 am UTC

Hi, yeah, so the string will always be the same, both in the number of 0s and 1s and in their order. It is what it is.
When you say I will need to store that number, do you mean 8 or 1679?
Sorry, but I don't really know much coding; this is more the end of a maths/crypto fun tour.
So file length is measured in bytes (8 bits), but this will have some left over? This will be 209 bytes and 7 bits. Will the last bit always be there, with respect to the memory address?

I've written a few scripts, but I'm no expert. It sounds good that it's trivial, though. Maybe I can do it in Python.

the number comes from here
http://en.wikipedia.org/wiki/Arecibo_message

It's the Arecibo message.

Being 1679 digits long is important, as 1679 is the product of two primes, which, once factorized, give the dimensions of each side:

1679 = 73 rows by 23 columns

But this info should be found by factorizing the number, not read from the file, which is why it's important that it's just 1679 digits long.
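
A rough Python 3 sketch of that recovery step: factor the length by trial division, then cut the digit string into rows. The helper names `factor_semiprime` and `to_grid` are made up for illustration, and the string of zeros is only a placeholder for the real message.

```python
def factor_semiprime(n):
    """Return (p, q) with p <= q and p * q == n, found by trial division."""
    p = 2
    while p * p <= n:
        if n % p == 0:
            return p, n // p
        p += 1
    raise ValueError("n has no nontrivial factor")

def to_grid(bits, columns):
    """Split a string of digits into rows of `columns` characters each."""
    return [bits[i:i + columns] for i in range(0, len(bits), columns)]

columns, rows = factor_semiprime(1679)   # smaller factor first
placeholder = "0" * 1679                 # stand-in; substitute the real digits
grid = to_grid(placeholder, columns)

print(columns, rows, len(grid), len(grid[0]))  # 23 73 73 23
```

Whether the smaller factor is the width or the height cannot be read off the number itself; both layouts would have to be tried.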

Basically, it will be weakly encrypted with an RSA public key that has a small modulus, so when the encrypted file is found along with the public key, the modulus can be factorized easily to find the two primes it's made of, and hence the private key can be computed to crack the encryption.

That leads to the string of 1679 digits, whose length can also be factorized to get the 'image'.
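
For illustration only, a Python 3 sketch of why a small modulus is easy to break. It uses textbook RSA with no padding and a hypothetical toy key (n = 3233, e = 17) that is not from the thread; `pow(e, -1, phi)` needs Python 3.8 or later.

```python
def factor_semiprime(n):
    """Return (p, q) with p <= q and p * q == n, found by trial division."""
    p = 2
    while p * p <= n:
        if n % p == 0:
            return p, n // p
        p += 1
    raise ValueError("n has no nontrivial factor")

# Hypothetical toy public key (assumption, far too small for real use).
n, e = 3233, 17
message = 65                      # any integer smaller than n
ciphertext = pow(message, e, n)   # encrypt with the public key

# The "attack": factor n, rebuild the private exponent, decrypt.
p, q = factor_semiprime(n)
phi = (p - 1) * (q - 1)
d = pow(e, -1, phi)               # modular inverse of e mod phi
recovered = pow(ciphertext, d, n)

print(p, q, d, recovered)         # 53 61 2753 65
```

A single textbook-RSA block can only carry a value smaller than n, so a 1679-bit message under a small modulus would have to be split into many small blocks.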

Not really important, just geeks being geeks...

A test, if you will.

But all help appreciated.

Cheers

undecim
Posts: 289
Joined: Tue Jan 19, 2010 7:09 pm UTC

Re: best way to store string of 0,s 1,s.

Postby undecim » Fri Jun 10, 2011 1:10 am UTC

ftfs wrote:hi, yea so the string will always be the same, both in number of 0's and 1's and there order. it is what it is.
when you say i will need to store that number, you meen, 8, or 1679?


You would need to store 1679, unless that number never changes. Files can only be stored in whole bytes, not bits, so you have one extra bit that has to be written as either a 1 or a 0. Better yet, store the number of unused bits, so that the stored number always fits in 1 byte.

Here's a simple Python script that takes a string of 1s and 0s (it will blow up on any other character) from the first line of stdin and writes the corresponding binary to stdout (padded with 0s at the end to finish the last byte).

Save it as binstr2bin.py. To use it, you can do something like "echo 0100100101101000 | python2 binstr2bin.py > examplefile". This will write "Ih" to examplefile, because that's what the binary string in the example encodes. If you use this with your example string, there will be an extra 0 at the end, because 1679 is 1 bit short of 210 bytes.

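Code:

#!/usr/bin/python2
# Pack a string of '0'/'1' characters (first line of stdin) into raw bytes on stdout.
import sys

s = raw_input()
while len(s) > 0:
        # Pad the final chunk with trailing 0s so it fills a whole byte.
        while len(s) < 8:
                s += '0'
        # Convert 8 binary digits to one byte and write it out.
        sys.stdout.write(chr(int(s[:8], 2)))
        s = s[8:]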



Here is the decode script. Save it as bin2bintxt.py and use it like "python2 bin2bintxt.py < examplefile".

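Code:

#!/usr/bin/python2
# Read raw bytes from stdin and print them as a string of '0'/'1' characters.
import sys

def byte2binstr(n):
        # Peel bits off the least significant end, prepending each one,
        # so the result reads most significant bit first.
        b = ''
        for i in range(8):
                b = str(n % 2) + b
                n = n >> 1
        return b

b = sys.stdin.read(1)
while b != '':
        sys.stdout.write(byte2binstr(ord(b)))
        b = sys.stdin.read(1)
print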
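
The two scripts above pad the last byte but do not record how much padding was added. A possible Python 3 sketch of the "store the number of unused bits" idea follows; `pack_bits` and `unpack_bits` are hypothetical names, and putting the pad count in a leading byte is an assumption made here, not something the thread specifies.

```python
def pack_bits(bits):
    """Pack a '0'/'1' string into bytes, prefixed by one byte holding the pad count."""
    if set(bits) - {"0", "1"}:
        raise ValueError("input must contain only '0' and '1'")
    pad = (-len(bits)) % 8                    # unused bits in the final byte
    padded = bits + "0" * pad
    body = bytes(int(padded[i:i + 8], 2) for i in range(0, len(padded), 8))
    return bytes([pad]) + body

def unpack_bits(data):
    """Inverse of pack_bits: drop the header byte and the padding bits."""
    pad = data[0]
    bits = "".join(format(byte, "08b") for byte in data[1:])
    return bits[:len(bits) - pad]

print(pack_bits("0100100101101000"))      # b'\x00Ih'
print(unpack_bits(pack_bits("101")))      # 101
```

With this layout the 1679-digit string would take 211 bytes: one header byte plus 210 bytes of data.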
Blue, blue, blue

ftfs
Posts: 34
Joined: Tue Mar 01, 2011 11:46 pm UTC

Re: best way to store string of 0,s 1,s.

Postby ftfs » Fri Jun 10, 2011 2:25 am UTC

yip, yip, yip,

That's fantastic, mate.

Just what I was after.

I'm going to use the first 16 bits to encode the number 1679, and leave a clue that these bytes should be used to 'discover' the message.
I think that's the best way to do it.
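
One way such a 16-bit length prefix might look in Python 3, using the standard `struct` module. The big-endian byte order and the helper names are assumptions; the post does not say which order is intended.

```python
import struct

def add_length_header(n_bits, packed):
    """Prefix packed data with the bit count as a 16-bit big-endian integer."""
    return struct.pack(">H", n_bits) + packed

def read_length_header(data):
    """Return (bit count, remaining packed bytes)."""
    (n_bits,) = struct.unpack(">H", data[:2])
    return n_bits, data[2:]

header = struct.pack(">H", 1679)
print(header.hex())                                      # 068f
print(read_length_header(add_length_header(1679, b"Ih")))  # (1679, b'Ih')
```

A 16-bit unsigned field can hold values up to 65535, so 1679 fits comfortably.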

cheers for your kind assist. it was great!

regards

:D

mcvoid
Posts: 24
Joined: Tue Jan 18, 2011 1:35 pm UTC

Re: best way to store string of 0,s 1,s.

Postby mcvoid » Fri Jun 10, 2011 12:52 pm UTC

One thing to look into is C's struct bit-fields, which allow bit packing. There you can break down each run of bits into its own named member, and the compiler takes care of byte/word boundaries automatically. The downside is that the layout of bit-fields (bit and byte ordering) is machine- and compiler-dependent, so it's mainly useful within a single environment.
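
To stay in the thread's language, the same idea can be tried from Python 3 through `ctypes`, which mirrors C bit-fields. This is only a sketch: the `Header` structure and its field names are invented for illustration, and the raw byte it prints depends on the platform's bit-field layout.

```python
import ctypes

class Header(ctypes.Structure):
    # Hypothetical layout: two named fields sharing one byte.
    _fields_ = [
        ("kind", ctypes.c_uint8, 3),    # 3-bit field
        ("count", ctypes.c_uint8, 5),   # 5-bit field
    ]

h = Header(kind=5, count=17)
print(h.kind, h.count)          # 5 17
print(ctypes.sizeof(Header))    # size in bytes as laid out on this platform
print(bytes(h).hex())           # raw storage; bit order is platform-dependent
```

Because the raw bytes are not portable, a file meant to be decoded elsewhere is usually safer with explicit shifts and masks than with bit-fields.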

Moose Hole
Posts: 398
Joined: Fri Jul 09, 2010 1:34 pm UTC

Re: best way to store string of 0,s 1,s.

Postby Moose Hole » Mon Jun 13, 2011 5:18 pm UTC

The file might be too small to benefit, but you may be able to make it smaller by using LZ-style compression (zip, rar, gz, etc.), if you're going for size.
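
A quick way to try this in Python 3 is the standard `zlib` module (DEFLATE, the method gzip uses). The digit string below is made-up stand-in data with the right length, not the real message, so the sizes it prints say nothing definite about the actual file.

```python
import zlib

# Stand-in data only: 73 repetitions of a 23-digit pattern (1679 digits).
bits = ("0" * 20 + "111") * 73

# Pack 8 digits per byte, padding the last byte with 0s.
packed = bytes(int(bits[i:i + 8].ljust(8, "0"), 2) for i in range(0, len(bits), 8))

for label, raw in (("ascii", bits.encode("ascii")), ("packed", packed)):
    compressed = zlib.compress(raw, 9)          # 9 = highest compression level
    assert zlib.decompress(compressed) == raw   # confirm the round trip
    print(label, len(raw), "->", len(compressed))
```

For inputs this small, the fixed overhead of the compressed format can eat much of the gain, which is the caveat raised in the post.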

evilspoons
Posts: 98
Joined: Tue Aug 26, 2008 12:44 am UTC
Location: Edmonton, AB, Canada

Re: best way to store string of 0,s 1,s.

Postby evilspoons » Mon Jun 13, 2011 6:26 pm UTC

Incidentally, is there any particular reason this data needs to be stored in a file versus just hard-coding it into whatever needs to access it? I interpreted your earlier posts as "the string will never, ever change".

evilbeanfiend
Posts: 2650
Joined: Tue Mar 13, 2007 7:05 am UTC
Location: the old world

Re: best way to store string of 0,s 1,s.

Postby evilbeanfiend » Mon Jun 13, 2011 9:36 pm UTC

What I don't understand from the descriptions so far is why it has to be as small as possible. The trivial ASCII solution gives you < 2 kB, which is tiny by modern standards. The problem you are trying to solve seems to be about encryption, not compression, so why worry about a factor of 8 increase if the total size is still less than a billionth of a penny's worth of disc space?
in ur beanz makin u eveel

naschilling
Posts: 142
Joined: Wed Apr 06, 2011 2:52 pm UTC

Re: best way to store string of 0,s 1,s.

Postby naschilling » Tue Jun 14, 2011 8:24 pm UTC

The simplest encoding I could think of would be an encoding similar to that used by fax machines. It's been many years, so I forget many of the details, but I still remember most of it.

Let's use a series of 0's and 1's 16 bits long: (This works for any length, but 16 is for demonstration.)
0000111100100100

We break it into sequential segments:
0000 1111 00 1 00 1 00

Then we count how many consecutive characters there are of each.
4 4 2 1 2 1 2

Putting them all back-to-back, our sequence becomes:
4421212

Obviously there are a few catches:
1. How do we know what we started with?
We ASSUME we start with a 0. (Using actual data, this is generally true anyway.) If this turns out to be false, we can handle it by starting with a count of 0 zeros. On decoding, writing no 0s produces the same result.
2. What if we have more than 15 (F) consecutive of the same character?
This can be handled by a trick. If a count is F (15), the next count refers to the same character as the current one, and the two add together. If there are ever exactly 15 of a character, then a 0 must come after to preserve the values.
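
The scheme described in this post could be sketched in Python 3 roughly as follows. This is one reading of the description, not actual fax encoding: counts are kept as a list of integers from 0 to 15 rather than packed into 4-bit nibbles, and the function names are invented.

```python
def rle_encode(bits):
    """Encode a '0'/'1' string as run lengths (0..15), starting with a run of 0s."""
    if set(bits) - {"0", "1"}:
        raise ValueError("input must contain only '0' and '1'")
    counts, current, i = [], "0", 0
    while i < len(bits):
        j = i
        while j < len(bits) and bits[j] == current:
            j += 1
        run = j - i                  # may be 0 if the data starts with a 1
        while run >= 15:             # 15 means "more of the same follows"
            counts.append(15)
            run -= 15
        counts.append(run)           # a run of exactly 15 ends as 15, 0
        current = "1" if current == "0" else "0"
        i = j
    return counts

def rle_decode(counts):
    """Rebuild the '0'/'1' string from the run lengths."""
    out, current = [], "0"
    for c in counts:
        out.append(current * c)
        if c != 15:                  # only a count below 15 ends the run
            current = "1" if current == "0" else "0"
    return "".join(out)

print(rle_encode("0000111100100100"))   # [4, 4, 2, 1, 2, 1, 2]
print(rle_encode("0" * 15 + "1"))       # [15, 0, 1]
print(rle_encode("1"))                  # [0, 1]
```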
If you don't have walls, why would you need Windows?

walkietokyo
Posts: 3
Joined: Sun Jun 26, 2011 2:32 pm UTC

Re: best way to store string of 0,s 1,s.

Postby walkietokyo » Sun Jun 26, 2011 2:44 pm UTC

naschilling wrote:Putting them all back-to-back, our sequence becomes:
4421212


I think it's worth noting that your algorithm only saves space when most of the runs of either zeros or ones are more than 4 long.

You just took a 16-bit message and turned it into 28 bits. (7 words @ 4 bits each) :D
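
That size comparison can be checked with a short Python 3 sketch. It assumes 4-bit counts, with 15 acting as a "run continues" marker as described earlier in the thread; `rle_cost_bits` is a made-up helper name.

```python
from itertools import groupby

def rle_cost_bits(bits, max_count=15, count_width=4):
    """Bits needed to run-length encode `bits` with fixed-width counts."""
    counts = 1 if bits.startswith("1") else 0    # leading zero-length run of 0s
    for _, run in groupby(bits):
        n = len(list(run))
        counts += n // max_count + 1             # continuation counts plus the final one
    return counts * count_width

example = "0000111100100100"
print(len(example), rle_cost_bits(example))      # 16 28
print(rle_cost_bits("0" * 40 + "1" * 40))        # 24
```

Each run costs at least 4 bits, so a run has to be longer than 4 digits before it saves anything.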

naschilling
Posts: 142
Joined: Wed Apr 06, 2011 2:52 pm UTC

Re: best way to store string of 0,s 1,s.

Postby naschilling » Tue Jun 28, 2011 9:00 pm UTC

As I mentioned, this is a version of the algorithm used by fax machines. It is made for cases where the data is predominantly 0s with just small clusters of 1s. If the data is well mixed, it is worth looking at various block-encoding schemes instead.
If you don't have walls, why would you need Windows?
