
Generating test data files

Need to generate some test data? Here are a couple of quick and easy ways:

Binary data of arbitrary length:

dd if=/dev/random of=foo.txt bs=1024 count=100

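This command generates 100 kilobytes of binary data read from /dev/random. To get larger files, change the block size or block count as needed. On some systems /dev/random can stall or return short reads while it waits for entropy, so /dev/urandom is usually the more practical source for test data.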
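If dd is not available, the same kind of file can be produced from a script. Here is a minimal Python sketch of the idea; the helper name write_random_bytes is hypothetical (it is not part of any tool mentioned here), and it uses os.urandom as the byte source:

```python
import os

def write_random_bytes(path, block_size=1024, count=100):
    """Write `count` blocks of `block_size` random bytes to `path`.

    Returns the total number of bytes written. The defaults mirror
    bs=1024 count=100 from the dd command.
    """
    total = 0
    with open(path, "wb") as out:
        for _ in range(count):
            block = os.urandom(block_size)  # random bytes from the operating system
            out.write(block)
            total += len(block)
    return total
```

Calling write_random_bytes("foo.bin") should produce a 102,400-byte file; as with dd, raise block_size or count to get a larger one.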

But what if you want to generate ASCII text data? I wrote a little Perl script (called genfile) to do just that. It can generate random text or pseudo-English (letter frequencies are weighted similarly to written English) with variable- or fixed-length records, and you can specify the destination file size in records (lines) or in bytes. Usage is as follows:

 USAGE:

 generate an ASCII file with user-specified characteristics
 suitable for use as test file

 genfile [-o OUTFILE] [-e] [-r NUM] [-z MAX] [-s]

 -o FILE          write output to file (default=data.out)
 -e               generate pseudo-English text (default is random text)
 -s               Suppress spaces in pseudo-English output
 -r NUM           generate fixed records (lines) with exactly NUM characters
                  (excluding end-of-line characters); if NUM is a negative
                  number, then the records will have random lengths with
                  a maximum of NUM characters. (default=80)
 -z MAX           generate a file with a maximum size of MAX
                  If MAX is a number, MAX records (lines) will be generated
                  MAX may also indicate a size in bytes, kilobytes,
                  megabytes, or gigabytes, by appending b, k, m, or g.
                  (Case is not significant.) For example,
                  45000B (45,000 bytes)
                  2048k (2,048 kilobytes)
                  12M (12 megabytes)
                  3.2G (3.2 gigabytes)
                  512k (512 kilobytes)
                  (default is 100k)
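The -z argument format is easy to get subtly wrong, so here is a hedged sketch of how such a value could be parsed. This is not genfile's actual code (the script itself is not shown here): parse_max is a hypothetical name, and treating k, m, and g as powers of 1024 is an assumption, since the usage text does not say whether the units are binary or decimal.

```python
import re

# Assumed multipliers: binary units (1k = 1024 bytes).
# The usage text does not specify binary versus decimal.
UNITS = {"b": 1, "k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}

def parse_max(spec):
    """Parse a -z style value.

    Returns ("records", n) for a bare number and ("bytes", n) when a
    b/k/m/g suffix is present. Case is not significant.
    """
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*([bkmg]?)\s*", spec, re.IGNORECASE)
    if match is None:
        raise ValueError(f"unrecognized size: {spec!r}")
    number = float(match.group(1))
    suffix = match.group(2).lower()
    if not suffix:
        return ("records", int(number))
    # Fractional sizes such as 3.2G are truncated to whole bytes.
    return ("bytes", int(number * UNITS[suffix]))
```

Under these assumptions, parse_max("500") means 500 records, while parse_max("12M") means 12,582,912 bytes.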

Some examples:

genfile

generates a random ASCII text file in data.out (the default output file) that is 100k in size, with 80 characters per line.

genfile -o gigantic.txt -e -r 80 -z 1G

writes a 1-gigabyte file called gigantic.txt that contains 80-character records of pseudo-English text.
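Pseudo-English text of this kind can be approximated by sampling letters according to their relative frequency. The following is a sketch under stated assumptions rather than genfile's implementation: the function name pseudo_english_line is hypothetical, the weights are rough approximate percentages for English letters, and the weight given to the space character is an arbitrary choice.

```python
import random

# Rough, approximate letter frequencies (percent) for written English.
# These values are assumptions for illustration, not taken from genfile.
LETTER_WEIGHTS = {
    "e": 12.7, "t": 9.1, "a": 8.2, "o": 7.5, "i": 7.0, "n": 6.7, "s": 6.3,
    "h": 6.1, "r": 6.0, "d": 4.3, "l": 4.0, "c": 2.8, "u": 2.8, "m": 2.4,
    "w": 2.4, "f": 2.2, "g": 2.0, "y": 2.0, "p": 1.9, "b": 1.5, "v": 1.0,
    "k": 0.8, "j": 0.15, "x": 0.15, "q": 0.1, "z": 0.07,
}
SPACE_WEIGHT = 18.0  # arbitrary assumption about how often spaces occur

def pseudo_english_line(length, spaces=True, rng=random):
    """Return `length` characters drawn with English-like letter weights.

    Pass spaces=False to leave the space character out of the alphabet.
    """
    symbols = list(LETTER_WEIGHTS)
    weights = list(LETTER_WEIGHTS.values())
    if spaces:
        symbols.append(" ")
        weights.append(SPACE_WEIGHT)
    return "".join(rng.choices(symbols, weights=weights, k=length))
```

The output is not readable English, only text whose letter distribution resembles it, which is often enough for testing compression or text-processing tools.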

genfile -s -r -100 -z 500

This example writes 500 variable-length records (a negative record length means variable length) of up to 100 random characters each. The -s option suppresses spaces in the output.
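The record-length convention (positive means fixed, negative means variable up to that size) can be sketched as follows. The function names are hypothetical, the alphabet of letters and digits is an assumption about what "random characters" means, and the minimum variable length of 1 is also an assumption, since nothing here says whether empty records are allowed.

```python
import random
import string

def record_length(num, rng=random):
    """Positive num: fixed length. Negative num: random length from 1 to abs(num)."""
    if num == 0:
        raise ValueError("record length must be non-zero")
    if num > 0:
        return num
    return rng.randint(1, -num)  # assumed minimum of 1 character

def random_records(num, count, rng=random):
    """Yield `count` records of random letters and digits, without line endings."""
    alphabet = string.ascii_letters + string.digits  # assumed character set, no spaces
    for _ in range(count):
        yield "".join(rng.choices(alphabet, k=record_length(num, rng)))
```

For instance, list(random_records(-100, 500)) yields 500 records, each between 1 and 100 characters long.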

July 12th, 2007

