How random is random?

Posted by Colin B. on May 11, 2007, 3:12 pm
In the context of random numbers, how does one measure (or even properly
describe) randomness?

Recently we needed to generate several large files of random data for
test purposes. Since we're working in Unixland, it was easy enough to
do:

$ dd if=/dev/random of=<filename> bs=x count=y conv=sync.

Now assuming that we keep the filesize the same (i.e. x*y=constant),
the time to generate files goes up as count increases and bs decreases.
The interesting thing is that files created with low count and high bs...
        - compress much better
        - generate far fewer lines (as measured by wc -l)

Now since compress and gzip are apparently entropy-based algorithms, it
stands to reason (at least by me!) that the small-count file has less
entropy. The question is, what does this actually mean, and what are the
consequences of it?
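
One rough way to put a number on "entropy" here is the Shannon entropy of the file's byte histogram, in bits per byte. The sketch below is only an illustration: `byte_entropy` is a hypothetical helper name, and a first-order measure like this ignores ordering, so a file can score a full 8 bits per byte and still compress well.

```python
import math
from collections import Counter

def byte_entropy(data: bytes) -> float:
    """Shannon entropy of the byte-value histogram, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    n = len(data)
    # Sum p * log2(1/p) over every byte value that occurs in the data.
    return sum((c / n) * math.log2(n / c) for c in Counter(data).values())

zeros = b"\0" * 4096          # a run of identical bytes carries no information
uniform = bytes(range(256))   # every byte value appears equally often

print(byte_entropy(zeros))
print(byte_entropy(uniform))
```

A file that is mostly runs of a single byte value lands well below 8 bits per byte, which is consistent with it compressing much better than truly random data.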

Thanks,
Colin

Posted by on May 11, 2007, 4:36 pm
wrote:

> $ dd if=/dev/random of=<filename> bs=x count=y conv=sync.
>
> Now assuming that we keep the filesize the same (i.e. x*y=constant),
> the time to generate files goes up as count increases and bs decreases.
> The interesting thing is that files created with low count and high bs...
> - compress much better
> - generate far fewer lines (as measured by wc -l)
>
> Now since compress and gzip are apparently entropy-based algorithms, it
> stands to reason (at least by me!) that the small-count file has less
> entropy. The question is, what does this actually mean, and what are the
> consequences of it?

'conv=sync' tells 'dd' that if it gets a short read from its input
then it should pad the output record to the specified blocksize with
zeroes. /dev/random can produce short reads if its entropy pool gets
depleted. If you examine the compressible output files I expect you'll
find that they contain lots of runs of zeroes, and those runs of zeroes
are highly compressible.

This is also the reason why the large 'bs' causes the file to be
generated more quickly.

OttoM.
__
ottomeister

Disclaimer: These are my opinions. I do not speak for my employer.


Posted by Colin B. on May 11, 2007, 5:15 pm
ottomeister@mail.com wrote:
> wrote:
>
>> $ dd if=/dev/random of=<filename> bs=x count=y conv=sync.
>>
>> Now assuming that we keep the filesize the same (i.e. x*y=constant),
>> the time to generate files goes up as count increases and bs decreases.
>> The interesting thing is that files created with low count and high bs...
>> - compress much better
>> - generate far fewer lines (as measured by wc -l)
>>
>> Now since compress and gzip are apparently entropy-based algorithms, it
>> stands to reason (at least by me!) that the small-count file has less
>> entropy. The question is, what does this actually mean, and what are the
>> consequences of it?
>
> 'conv=sync' tells 'dd' that if it gets a short read from its input
> then it should pad the output record to the specified blocksize with
> zeroes. /dev/random can produce short reads if its entropy pool gets
> depleted. If you examine the compressible output files I expect you'll
> find that they contain lots of runs of zeroes, and those runs of zeroes
> are highly compressible.
>
> This is also the reason why the large 'bs' causes the file to be
> generated more quickly.

Ah hah! That explains some other behaviour I noticed after posting this,
namely that until a certain point, increasing bs (and decreasing count)
didn't seem to produce the behaviour I described.

Now that I actually look at the output from dd, I can see the same thing--
0 full records and count partial records if bs is high enough (> 1040,
in this case).

But shouldn't /dev/random (on Solaris, BTW) block until it can fill the
request for whatever block size? Or can it only block between calls?

Thanks,
Colin

Posted by on May 11, 2007, 6:54 pm
wrote:
> ottomeis...@mail.com wrote:
> > wrote:
>
> >> $ dd if=/dev/random of=<filename> bs=x count=y conv=sync.
>
> Ah hah! That explains some other behaviour I noticed after posting this,
> namely that until a certain point, increasing bs (and decreasing count)
> didn't seem to produce the behaviour I described.
>
> Now that I actually look at the output from dd, I can see the same thing--
> 0 full records and count partial records if bs is high enough (> 1040,
> in this case).
>
> But shouldn't /dev/random (on Solaris, BTW) block until it can fill the
> request for whatever block size? Or can it only block between calls?

The /dev/random device should block unless the app
has gone out of its way to turn on non-blocking behaviour.
It's unlikely that 'dd' would do that; certainly the Solaris
'dd' doesn't. The old Solaris /dev/random pipe, fed by
the cryptorand daemon, could return short reads just
like any other named pipe.

But in fact I was wrong: what's happening here has
nothing to do with blocking or entropy depletion. You're
being bitten by an undocumented quirk of the Solaris
/dev/random driver, which is that the amount of data
it will deliver in response to a single read() is capped.
That cap happens to be 1040 bytes.
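
The interaction between a per-read cap and 'conv=sync' can be imitated
without a Solaris box. The following is only a rough simulation, not
what 'dd' or the driver actually does internally: `simulate_dd_sync`,
`report` and `max_read` are hypothetical names, `os.urandom` stands in
for /dev/random, and the 512 and 8192 block sizes are arbitrary picks.

```python
import os
import zlib

def simulate_dd_sync(total_bytes, bs, max_read):
    """Imitate `dd bs=<bs> conv=sync` reading a source that returns at most
    max_read bytes per read(): one read per record, NUL-padded up to bs."""
    out = bytearray()
    for _ in range(total_bytes // bs):
        chunk = os.urandom(min(bs, max_read))   # a single, possibly short, read
        out += chunk.ljust(bs, b"\0")           # conv=sync pads the record with zero bytes
    return bytes(out)

def report(data):
    """Return (compressed size / original size, number of newline bytes)."""
    return len(zlib.compress(data)) / len(data), data.count(b"\n")

# Same output size in both cases; only the block size differs.
small_bs = simulate_dd_sync(1 << 20, 512, 1040)    # reads always fill the record
large_bs = simulate_dd_sync(1 << 20, 8192, 1040)   # 1040 random bytes, then 7152 zeros

for name, data in (("bs=512", small_bs), ("bs=8192", large_bs)):
    ratio, newlines = report(data)
    print(f"{name}: compressed/original = {ratio:.2f}, newlines = {newlines}")
```

With the large block size most of each record is padding, so the file
compresses far better and contains far fewer newline bytes, which is
what 'wc -l' counts.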

I suppose it's within the driver's rights to do that but
when it's mixed with 'dd' like this the result is quite
unpleasant. There are easy workarounds once you
know what's happening (e.g. keep 'bs' below 1040,
or pipe the output of 'cat /dev/random' into 'dd', or
don't use 'dd') but before you can do that you have
to actually notice that something is broken. 'dd'
does tell you, in its own cryptic fashion, that the
input records were incomplete. I doubt I'd have
spotted that.
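
The "don't use 'dd'" route comes down to looping until a full block has
been collected instead of trusting a single read(). A minimal sketch of
that idea follows; `read_exact` and `CappedSource` are hypothetical
names, and `CappedSource` is only a stand-in that mimics a 1040-byte
cap, not the real driver.

```python
import os

def read_exact(stream, n):
    """Read exactly n bytes, looping over short reads; returns less only at EOF."""
    parts = []
    remaining = n
    while remaining > 0:
        chunk = stream.read(remaining)
        if not chunk:              # empty read means end of input
            break
        parts.append(chunk)
        remaining -= len(chunk)
    return b"".join(parts)

class CappedSource:
    """Stand-in for a device that hands back at most `cap` bytes per read()."""
    def __init__(self, cap=1040):
        self.cap = cap

    def read(self, n):
        return os.urandom(min(n, self.cap))

src = CappedSource()
print(len(src.read(8192)))          # a single read comes back short
print(len(read_exact(src, 8192)))   # looping collects the whole record
```

Against a real device the same helper could be handed a file opened
with open("/dev/random", "rb", buffering=0), assuming the device blocks
rather than reporting end of file.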

OttoM.
__
ottomeister

Disclaimer: These are my opinions. I do not speak for my employer.


Posted by Ertugrul Soeylemez on May 12, 2007, 7:16 pm

> $ dd if=/dev/random of=<filename> bs=x count=y conv=sync.
>
> Now assuming that we keep the filesize the same (i.e. x*y=constant),
> the time to generate files goes up as count increases and bs
> decreases.

This is a buffering issue and has nothing much to do with the PRNG.


> The interesting thing is that files created with low count
> and high bs...
>         - compress much better
>         - generate far fewer lines (as measured by wc -l)

Then the PRNG in Solaris is messed up. Regardless of dd's transfer
block size, the compression rate should always be the same: around 0%.
Especially since /dev/random is supposed to be a character device, there
should be no difference.
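
The "around 0%" figure is easy to check on any system. The sketch below
is an illustration only: it uses `os.urandom` and `zlib` as stand-ins
for /dev/random and gzip, and the 1 MiB sample size is an arbitrary
choice.

```python
import os
import zlib

n = 1 << 20                          # 1 MiB of random bytes
data = os.urandom(n)
compressed = zlib.compress(data, 9)  # highest compression level

# Fraction of space saved; random data should leave this very close to zero.
saving = 1 - len(compressed) / n
print(f"space saved: {saving:.2%}")
```

The saving is usually a hair below zero, because the compressor adds a
little framing overhead to data it cannot shrink.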


> Now since compress and gzip are apparently entropy-based algorithms, it
> stands to reason (at least by me!) that the small-count file has less
> entropy. The question is, what does this actually mean, and what are the
> consequences of it?

That Solaris' PRNG is messed up, or that you did something wrong.


Regards,
Ertugrul Söylemez.


-- 
Security is the one concept, which makes things in your life stay as
they are. Otto is a man, who is afraid of changes in his life; so
naturally he does not employ security.
