Filsplit



Author: Dan Mares, dmares @ maresware . com (you will be asked for e-mail address confirmation)
Portions Copyright © 1998-2021 by Dan Mares and Mares and Company, LLC
Phone: 678-427-3275

One liner: Effectively split files into smaller files. "Split out"/Extract "sample" records

Sample Maresware Batches: an executable with data that demonstrates various Maresware software. Download and run the appropriate _10_xx batch for the Filsplit demo.

All programs are command line programs.
They MUST be run within a command window as administrator.



Purpose

Filsplit breaks a file into pieces. It copies a section of records from an input file and places them in an output file. The selected section can be a contiguous chunk of records from within the file, a sample consisting of every ‘n’th record, or a specific number of characters.

The split sections can then serve as a “SAMPLE” of the original file for testing processing procedures.

Records are split according to command line options input by the user.

In special instances, you can trick the program into using a false record size in order to copy the correct number of characters to the output. You do not have to use the actual record size, but can consider any number of characters as a record.
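One way to pick such a false record size can be sketched in Python. This is only an illustration of the arithmetic, not Filsplit's own code: the helper name false_record_params is hypothetical, and the 32768 cap is taken from the -s example under Command Lines and assumed here to apply to -r in general.

```python
import math

MAX_RECORD_LENGTH = 32768  # assumed cap, taken from the -s example


def false_record_params(start_char, char_count, max_len=MAX_RECORD_LENGTH):
    """Return (record_length, begin_record, record_count) that covers
    exactly char_count characters starting at start_char (0-based).

    Any record length that divides both numbers works; this picks the
    largest such length that does not exceed max_len.
    """
    if start_char < 0 or char_count <= 0:
        raise ValueError("start_char must be >= 0 and char_count > 0")
    common = math.gcd(start_char, char_count)
    # Walk down from the largest allowed length to the first divisor.
    for record_length in range(min(common, max_len), 0, -1):
        if common % record_length == 0:
            return (record_length,
                    start_char // record_length,
                    char_count // record_length)


r, b, c = false_record_params(20000, 16000)
print(f"-r {r} -b {b} -c {c}")
```

Under these assumptions, the three values would stand in for the -r, -b and -c values when a section does not line up with the file's real record length. A record length of 1 always works, since it divides every offset and count.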



Operation

On the command line, the user specifies which section of records is needed. The user supplies: a record length (not necessarily the actual record length, but one the program uses to calculate how many characters to copy); a beginning record number; and a number of records to copy.

From the information given on the command line, the program calculates at what point in the file to start copying characters, and how many characters to copy to the output file.
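The calculation described above amounts to two multiplications. The Python sketch below shows one plausible reading of it, assuming records are numbered from 0 as stated under Filsplit Options. The function names byte_range and copy_section are hypothetical, and the code is not the program's actual implementation.

```python
import io


def byte_range(record_length, begin_record, record_count):
    """Return (start_offset, characters_to_copy) for a record section.

    Records are numbered from 0, so record 0 starts at offset 0.
    """
    return record_length * begin_record, record_length * record_count


def copy_section(src, dst, record_length, begin_record, record_count):
    """Copy the section from binary stream src to dst.

    Returns the number of characters actually copied, which is smaller
    than requested if the input ends early.
    """
    start, remaining = byte_range(record_length, begin_record, record_count)
    src.seek(start)
    copied = 0
    while remaining > 0:
        chunk = src.read(min(remaining, 1 << 20))  # at most 1 MiB at a time
        if not chunk:
            break  # reached end of input
        dst.write(chunk)
        copied += len(chunk)
        remaining -= len(chunk)
    return copied


# Values from the first example under Command Lines: -r 200 -b 50 -c 10
print(byte_range(200, 50, 10))  # (10000, 2000)
```

With real files, src and dst would be opened in binary mode ("rb" and "wb", or "ab" to mimic the -a append option).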

If the option of every ‘n’th record is chosen, the program selects every ‘n’th record to the output. This option is used to select a sample from the file.
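A minimal Python sketch of this kind of sampling follows. The source does not say which record is selected first or how a trailing partial record is treated, so two assumptions are made and marked in the code: the first record kept is the ‘n’th one (counting from 1), and an incomplete final record is dropped. The name sample_every_nth is hypothetical.

```python
import io


def sample_every_nth(src, dst, record_length, step):
    """Write every step-th fixed-length record from src to dst.

    Assumption: with step = 10 the records kept are the 10th, 20th,
    30th, ... (counting from 1). Filsplit may start elsewhere.
    Returns the number of records written.
    """
    if record_length <= 0 or step <= 0:
        raise ValueError("record_length and step must be positive")
    written = 0
    position = 0  # 1-based count of records read so far
    while True:
        record = src.read(record_length)
        if len(record) < record_length:
            break  # assumption: drop a trailing partial record
        position += 1
        if position % step == 0:
            dst.write(record)
            written += 1
    return written


# Ten 4-character records; keep every 3rd one.
data = b"".join(bytes([65 + i]) * 4 for i in range(10))
out = io.BytesIO()
sample_every_nth(io.BytesIO(data), out, 4, 3)
print(out.getvalue())  # b'CCCCFFFFIIII'
```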

Another option is to select records based on the numbers in a file containing random numbers. This file should have one number per line, with no line longer than 12 characters. Its name is given to the program through the -f option (for "file"). The file does not have to be sorted, but sorting it may produce output whose records are easier to associate with the input later. The program cannot detect duplicate entries in the random number file, so it is best to eliminate duplicates before running the program. If you don’t, the output will contain duplicate records.
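Preparing the random number file and selecting by it could look roughly like the Python sketch below. It assumes the numbers are 0-based record numbers (the source states this only for -b and -d) and that numbers past the end of the file are skipped. All function names are hypothetical and none of this is Filsplit's own code.

```python
import io


def read_record_numbers(lines):
    """Parse one record number per line, enforcing the 12-character limit."""
    numbers = []
    for line in lines:
        text = line.strip()
        if not text:
            continue  # ignore blank lines
        if len(text) > 12:
            raise ValueError("line longer than 12 characters: " + text)
        numbers.append(int(text))
    return numbers


def dedupe_sorted(numbers):
    """Remove duplicates and sort, as recommended before running the program."""
    return sorted(set(numbers))


def select_records(src, dst, record_length, numbers):
    """Copy the listed records from src to dst, in the order given.

    Assumption: numbers are 0-based. Duplicates are copied again,
    mirroring the duplicate output described in the text.
    """
    written = 0
    for number in numbers:
        src.seek(number * record_length)
        record = src.read(record_length)
        if len(record) == record_length:  # skip numbers past end of file
            dst.write(record)
            written += 1
    return written


raw = read_record_numbers(["3\n", "1\n", "3\n", "\n"])
print(raw)                 # [3, 1, 3]
print(dedupe_sorted(raw))  # [1, 3]
```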

Versions of Filsplit with version numbers greater than 2.xx and dated after February 2002 work on large NT files that exceed the traditional 2 GB Windows 98 limit.



Command Lines

C:> filsplit input.fle output.fle -[options[rabcdefls]]

C:> filsplit  myfile.in  yourfile.out  -r  200  -b 50  -c 10

200 == record length
50  == begin at record no 50
10  == copy 10 records to the output

C:> filsplit  myfile  outfile  -r 80 -b  16000 -c 20

80    == record length
16000 == begin at this record number (based on record length of 80)
20    == copy 20 records based on 80 char. reclen

C:> filsplit  myfile  output   -d 20000 -D 16000

20000 == begin at this character number (displacement)
16000 == copy 16000 characters

C:> filsplit  myfilein  yourfileout  -r 100  -s 10

100 == input record length [max of 32768]
-s 10 == (sample) place every 10th record in the output file



Filsplit Options

On the following lines, the # should be replaced by an appropriate value. For the -b and -d options, values start at 0, not 1.

-a   Append to an existing output file

-r #   Where  # = input record length. Normally this is the actual record length, but it can be any number the program should use to calculate the number of characters to copy. (The -r option is mutually exclusive with the -d and -D options: if -d and -D are used together, -r CANNOT be used.)

-b #   Where  # = number of the record at which to begin copying (The default is 0, which is the first record.)

-c #   Where  # = the total number of records to copy. (-b and -c are matching options)

-d #   Where  # = displacement to start copying from (This option is used instead of -b if you wish to use actual displacement instead of record number.)

-D #   Where  # = number of characters to copy (Use this option instead of -c if the character count is not a whole number of records.) (-d and -D are matching options)

-e    Process the input as an EBCDIC file and convert it to ASCII.

-f + name     Name of file containing random numbers

-l    Look at specific record numbers. (NOT available)

-s #   Where  # = sample every ‘n’th record
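For the -e option, the kind of conversion involved can be illustrated with Python's built-in codecs. The source does not say which EBCDIC code page Filsplit assumes, so cp037 (a common US/Canada code page) is used here purely as an assumption, and the function name ebcdic_to_ascii is hypothetical.

```python
def ebcdic_to_ascii(record, codepage="cp037"):
    """Convert one EBCDIC record to ASCII bytes.

    Assumption: the input uses code page cp037. Characters with no
    ASCII equivalent are replaced by '?'.
    """
    text = record.decode(codepage)
    return text.encode("ascii", errors="replace")


# The EBCDIC bytes C8 C5 D3 D3 D6 spell "HELLO" in cp037.
print(ebcdic_to_ascii(b"\xc8\xc5\xd3\xd3\xd6"))  # b'HELLO'
```

Because the conversion is one character in, one character out, record lengths and displacements are the same before and after it.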


Related Programs

Split
