Diskcat




Author: Dan Mares, info @ maresware . com
Portions Copyright © (1998-2008) Mares and Company, LLC
Phone: (770)242-6687 X 119
Last update: 04-25-2008
16 BIT version no longer Supported


Purpose

Diskcat, in its basic operation, will traverse the entire directory structure of your hard disk and create a listing (catalogue) of all files and/or directories on the disk.

It is designed to be used for investigative/forensic purposes by creating a catalog of files on hard or floppy disks. The output is a fixed length record which lends itself to importation into a database for further analysis or sorting. Also, any spreadsheet can easily import fixed length records.

In addition to creating a catalog listing, it has many options which can be used to enhance the forensic or E-Discovery process. For instance, it can create a CRC or hash for each file, and it can check the header of each file to determine the file type or to find a mislabeled file. If the +H (upper case H) option is used, it will place in the output file ONLY those files which match the types of headers the user provided in the “header.fil”. As of August 2006, the user can also designate a "Category" to place each header in. The category option is extremely useful for E-Discovery, in that the user can specify categories for every file type, or header type.

It can also search for: specific file types; files of specific dates or sizes; and can, in effect, be programmed to search for files meeting specific criteria. This operation can turn Diskcat into a “findfile” program.

When “cataloging” files from many disks or multiple runs, it can “tag” each output record with a specific label indicating the disk that contained that file. This makes files easier to locate at a later date when searching for them on different disks. The "label" is usually the unique serial number seen when you do a DIR command in the command window. However, the user can provide their own unique label. Do not confuse this Windows serial number with the physical serial number the manufacturer places in the disk firmware. They are not the same thing.

When run on an NTFS file system, the 32 bit version also has the capability of showing files with associated Multiple Data Streams (-s), and of showing the owner of the file (-u or -U username).

For each file listed it can execute a specific program or DOS command on that file. For instance, if the user asked Diskcat to locate all *.zip files it would list all those files with a .zip extension. Then the user could ask it to execute the PKZIP with the appropriate command line option to display the contents of all the .zip files found. This would effectively produce a listing of all the files contained in all .zip files found on the disk being examined. (see also the --ziplog option).

Other programs could be run for personal directory maintenance. Diskcat could locate files over XX days old, and then run the MS-DOS del command on those files to clean out the disk.

A user-designed batch file could also be run on files selected by Diskcat, thus allowing the user to accomplish almost any operation on the file.

The fixed length or delimited output of Diskcat can be used as input to programs like upcopy or rm to perform specified maintenance on those files meeting the Diskcat criteria.



CRC or HASH

Another very useful feature is the -c option for CRC (cyclic redundancy check), together with the ini option HASH=ON or the --hash option.

This option is quite helpful when using the program to check for a corrupted file or program. The -c (CRC) option uses a CRC algorithm to produce a CRC hex value for each file processed. It then places this 8 digit hex value right after the file name. The 8 digits are always in the same location. Therefore, if the file name is a short one, the file name is padded with blanks to get to the CRC value. This CRC value is the same one produced by Crckit and Hash, and internally by those programs producing a ZIP output.
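To illustrate what an 8 digit hex CRC looks like, here is a minimal Python sketch. It is not Diskcat's own code; it uses the standard zlib CRC-32 routine, which is the CRC-32 used in ZIP archives. The function names crc32_hex and crc32_of_file, and the choice of upper case hex digits, are assumptions made for this example.

```python
import zlib

def crc32_hex(data: bytes) -> str:
    """Return the CRC-32 of data as a fixed-width, 8 digit hex string."""
    # Mask to 32 bits so the value is unsigned, then zero-pad to 8 digits.
    return format(zlib.crc32(data) & 0xFFFFFFFF, "08X")

def crc32_of_file(path: str, chunk_size: int = 65536) -> str:
    """Compute the CRC-32 of a file in chunks, without loading it whole."""
    crc = 0
    with open(path, "rb") as fh:
        while chunk := fh.read(chunk_size):
            # Passing the running value continues the same CRC calculation.
            crc = zlib.crc32(chunk, crc)
    return format(crc & 0xFFFFFFFF, "08X")
```

Because the result is always 8 characters wide, it fits a fixed length record regardless of the file's size or contents.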

This CRC can be used as a file verification or corruption check on seized files, or just for historical purposes. To verify that a file (or files) has not been altered, you might follow this procedure: Initially, create an output file containing the CRCs of known good files (original installations). Then, periodically, create a second output file containing the current CRCs of the same files. Then compare the two outputs for differences in the CRC field. Any change in a CRC would indicate something wrong with that file. This process can be implemented in any number of ways with other Maresware programs. It is described here as an example of the robust capability of Diskcat.
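The comparison step of this procedure could be sketched in Python as follows. This is only an illustration under stated assumptions: the records are taken to be pipe delimited, and the positions of the path and CRC fields (path_field and crc_field) are parameters you would have to set to match your own output layout. The function names are hypothetical.

```python
def load_crcs(lines, path_field=0, crc_field=1, delimiter="|"):
    """Map each path to its CRC from delimited catalog records."""
    crcs = {}
    for line in lines:
        fields = [f.strip() for f in line.rstrip("\r\n").split(delimiter)]
        # Skip headers, footers, or short lines that lack the needed fields.
        if len(fields) > max(path_field, crc_field):
            crcs[fields[path_field]] = fields[crc_field]
    return crcs

def changed_files(baseline_lines, current_lines, **layout):
    """Return paths whose CRC differs between the baseline and current runs."""
    baseline = load_crcs(baseline_lines, **layout)
    current = load_crcs(current_lines, **layout)
    return sorted(path for path, crc in current.items()
                  if path in baseline and baseline[path] != crc)
```

Files that appear in only one of the two runs are ignored here; a real check might want to report those as well.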

When using the 32 bit version, be aware that using the -c (CRC) option will alter the last access time of a file. If you wish to restore the original last access time after the CRC is calculated, you can either use the -R (reset file time) option or set an environment variable RESET. (See It's About Time in Hash for a complete explanation.)



Operation

Diskcat's default is to recurse the entire directory tree from its default directory (if you were at root, this means the entire drive), and to list every file to the screen. It produces an output listing which is normally in a three column format. The first column is the filename including the path (defaulted to 60 characters so it is easily viewed on the screen); the second column is filesize; and the third column is the file attributes. If an output file is selected, the disk serial number of the source drive is added as a default fourth column, as seen below (at Sample Output). If you want output different from the default, appropriate options must be set. Pay attention.

Path/filename width, and columns such as comment, CRC, date, time, disk “Label”, file type, file owner are added as the various options are chosen.

With the 32 bit version, long filenames are handled effortlessly. To expand the path to allow for full display of the complete path/filename, use the -w option or the --variable command line switch.


Sample Output

Below is a sample of the normal output. (Filename length and spaces have been truncated in order to fit on the page.) Also notice the alternate data stream identifier for the junk file.

Disk serial number (DIR visible serial number) is provided if the Label (-i) option is used. If the -I (Label ) option is used, then the serial number is replaced by that user-supplied label.


Path/filename                      filesize  attrib  disk_serial_no.
D:\WORK\VER20\WINREL\DISKLABL             4  A....E  24F9-7921
D:\WORK\VER20\WINREL\DISKCAT.INI        350  A.....  24F9-7921
D:\WORK\VER20\WINREL\BASE.obj         28267  A.....  24F9-7921
D:\WORK\VER20\WINREL\OPTIONS.obj      29454  A.....  24F9-7921
D:\WORK\VER20\WINREL\FIXNAME.obj       2519  A.....  24F9-7921
D:\WORK\VER20\WINREL\sortfile.obj      3535  A.....  24F9-7921
D:\WORK\VER20\WINREL\diskcat.obj      36419  A.....  24F9-7921
D:\WORK\VER20\WINREL\EBC_ASC.obj       3727  A.....  24F9-7921
D:\WORK\VER20\WINREL\INIT_LIB.obj      3501  A.....  24F9-7921
D:\WORK\VER20\WINREL\DMP_FUN.obj       3248  A.....  24F9-7921
D:\WORK\VER20\WINREL\DISKCAT.exe     156872  A.....  24F9-7921
D:\WORK\VER20\WINREL\junk                10  A.....  24F9-7921
D:\WORK\VER20\WINREL\junk:alt.txt        15  ADATA.  24F9-7921


Disk Labels

There is also an option (-I), for Identifier or insert, which provides for a literal tag or disk label to be added to the record. This option allows for an automatic labeling or a manual label input by the user. The automatic labeling is suggested when cataloging multiple disks in a forensic setting.

This labeling option is only allowed if you are using it in conjunction with the -aO output option, which creates/appends an output file containing the results of the program. The disk label is normally used when creating disk catalogs of numerous disks. You can provide a unique label (up to 9 characters long) for each disk. If the disk label ([-I label] option) is used, then the disk serial number is replaced by that label. It is suggested that all the disk labels, if used, be of the same length so that when the file is printed the disk labels all line up properly. (REMEMBER: if you are using this program to catalog many floppy disks, always use the -[aO] (append) options to cause the output file to be appended.) See also the COMMENT options.

As an alternative to keying in a separate disk label each time with the -I option, you can use the lower case -i option. The lower case option is an automatic number incrementing option. The program must be run from a default hard disk directory. It then looks for a file called DISKLABL in the default directory. If it doesn’t find one, it will create it. It picks up the 10 character ascii contents of the file DISKLABL; if none is there, it starts the label numbers at 1001. This 1001 is used as the label to add to each record of the output file, just as if you had keyed in -I 1001. The program then places the 1001 in the DISKLABL file and closes it.

Then, when the next disk is catalogued and the program finds 1001 (the last label number used) as the contents of the DISKLABL file it takes that 1001 and adds 1 to it to make it 1002. This 1002 then becomes the label to add to the output file records. It also replaces the DISKLABL contents with 1002 so the next time the program is run it will find 1002, and increase it to 1003 etc.

If, however, you want to start the numbering at a specific place, such as a case number, a search site number, or an alphanumeric label, you should first create the DISKLABL file and place in it the ascii contents of the number you wish to start at, less 1. For example, if you want it to start at MAR1001, the initial contents of DISKLABL should be MAR1000. The program will subsequently take care of incrementing the numbers. No provision is made for incrementing the alpha section of the label, and the numeric part MUST be at the end.
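The incrementing rule described above can be sketched in Python. This is an illustration of the described behavior, not the program's actual code. The function name next_label is hypothetical, and keeping the same number of digits (so that zero padded labels keep their width) is an assumption of this sketch.

```python
import re

def next_label(current: str) -> str:
    """Return the label that follows `current` in the DISKLABL sequence."""
    current = current.strip()
    if not current:
        # No previous label: the text above says numbering starts at 1001.
        return "1001"
    # Split into an optional alpha prefix and the trailing run of digits.
    match = re.fullmatch(r"(.*?)(\d+)", current)
    if match is None:
        raise ValueError("label must end in a number: %r" % current)
    prefix, digits = match.groups()
    # Assumption: keep the digit count so labels stay the same width.
    return prefix + str(int(digits) + 1).zfill(len(digits))
```

For example, next_label("MAR1000") gives "MAR1001", and next_label("1001") gives "1002". The alpha prefix is never changed, which matches the statement that only the numeric part at the end is incremented.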

The default disk label is the disk serial number (if no other label was chosen).



16 and 32 Bit Versions

There are certain differences between the 16 and 32 bit versions. (As of 2008, the 16 bit version is no longer supported.)

FILE ACCESS TIME: Using any version of Diskcat with any of the following options: (-h, -z, +h, --hash or -c) will alter the last access date on an NTFS or WIN95 file system. This may cause an evidentiary problem for some investigations. (See It’s About Time in the hash.exe documentation for a full explanation.)

The 32 bit NT version can be set to replace the original last access time of the file if the -R option is used. (This can also be accomplished with an environment variable of RESET.) When running the program without one of the options that “OPENS” a file the last access date is not altered. (You can verify this for yourself before using it on evidence. The command <mdir.exe> with the -ta option can be used to verify last access times of files on NTFS.)



Options

Diskcat is INI capable.

INI keywords are shown here in [BOLD, ALL CAPS].

All options should be preceded by a (-) minus sign (with the exception of the +h and +H options). Some can be grouped together, and others MUST be grouped without a space (each description specifies which style to use). The options are grouped where appropriate. Some options conform to the *IX format of using a minus-minus (--) syntax. Where appropriate it will be identified.

Some options are only active in the 32 bit version running on an appropriate file system because they deal with specific 32 bit items like MDS (Multiple/Alternate Data Streams) or file times.

-p + path(s)    If more than one directory is to be looked at, then add the paths here as appropriate with spaces between each. (-p   c:\windows   d:\work). Default is to begin at the current default path if no path option is used. [PATH]=path

--path=single_path_to_traverse    Only a single path is used/allowed in this -- option.

-f + filespec    If more than one file type is needed, add them here with spaces between each. (-f   *.c   *.obj   *.dll). Default is to find ALL files if no file option is used. [FILES]=filetype(s),one per line

If the above options are used, the program builds a matrix of paths and file types. It searches all the requested directories for all the requested file types, thus producing a total of all the files in all the paths requested. These options are added to any default command line provided.
(C:>mdir c:\work\*.c -f *.dll -p d: \windows)

--filename=single_file_type    Only a single filetype is used/allowed in this -- option.

-x + filespec    E(x)clude these file types from listing (same format as -f option) (-x   thesefiles.txt) [EXCLUDE]=filetype

--exclude=single_file_type_to_exclude    Only a single filetype is used in this -- option.

-oO + path/filename    Output file name: place the output to a filename. If uppercase ‘O’ then existing output is appended to. The special output option -ostdout should be used if you wish to redirect the output to another file or directly to a printer. This option (-ostdout) is very specialized and may not work with some other options or other programs. [OUTPUT]=filename

--output=outputfilename    Same as above except output is always appended to (-a).

-[oO] + MMDDYY:     Causes the output file to be named with today's date. Used in batch files to get automatic output naming.

-V    Output records are variable length. This guarantees that the full path is included. Also inserts pipe delimiters (-d "|") by default. Mutually exclusive with the -w(idth) xx option.

-w + #   Change the default width of the filename from 35 to whatever value you wish. If you have long filenames, this may be necessary to accommodate the entire name. If a filename longer than 35 is used, the output tends to be more than one line long. (-w 250) [WIDTH]=50

--width=value   Same as -w (--width=250)

-a      Append output to filename provided in -o option. Serves same purpose as using an upper case O. (-a) [APPEND]=[ON|OFF]

-C + "comment"  Add a "comment" to the beginning of every record. This is very useful when ultimately merging many outputs from different locations or for different cases. The comment can uniquely identify the sources of the hash values. Example: (-C SUSPECT_CPU#1). The resulting output records would look something like this: "SUSPECT_CPU#1 C:\WINNT\....\filename etc."

-C + COMPUTERNAMExx  A special version of the -C option. If the literal COMPUTERNAME (all uppercase) is used, then the program will find the name of the computer and insert it there. This works like a wildcard substitution: the user lets the system decide what to put there. This can then uniquely identify the source computer the data record comes from. Example: (-C COMPUTERNAME). The resulting output records would look something like this: "CPU-2_ATLANTA C:\WINNT\....\filename etc.". If the xx is replaced by a numeric value, then the computer name field is made this many characters wide. (-C COMPUTERNAME20) becomes: "CPU-2_ATLANTA          C:\WINNT\....\filename etc.". Using the xx value is suggested so that each record still maintains a fixed width for the comment field.
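The COMPUTERNAMExx substitution can be pictured with the short Python sketch below. It is only an illustration of the behavior described above. The function name expand_comment is hypothetical, socket.gethostname() stands in for however Diskcat actually obtains the computer name, and what happens to a name longer than xx is not stated in this manual, so the sketch leaves such names unchanged.

```python
import re
import socket

def expand_comment(comment: str, computer_name: str = "") -> str:
    """Expand a COMPUTERNAME[xx] comment; return other comments unchanged."""
    match = re.fullmatch(r"COMPUTERNAME(\d*)", comment)
    if match is None:
        return comment  # an ordinary comment is used literally
    # Assumption: the host name is a reasonable stand-in for the computer name.
    name = computer_name or socket.gethostname()
    width = int(match.group(1) or 0)
    # Pad with blanks on the right so the comment field has a fixed width.
    return name.ljust(width)
```

With a width given, every record gets a comment field of the same length, which keeps the following fields in fixed positions.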

-v    No 'V'erbose. Do not print headers/footers to output file. Otherwise, a header and footer are placed in the output file, making it cumbersome to import or massage with other software.

-1 + path/filename   (That's a one, not an ell.) The filename here is a file which will contain accounting/log information about the run. It is always appended to, and contains the command line plus statistics about how many files were processed and the time of the run. The file can later be used as a batch file for duplicating the runs. The ACCT environment variable can also be set (SET ACCT=logfilename), or use the .INI option [ACCT=filename]. The order of priority is: Environment, INI file, Command Line option. To explicitly turn it off, use +1, and NO accounting log is maintained.

--memo    Causes an interactive dialog with the user which allows the user to input up to 2000 characters of "memo" information. The user is first asked for the name of a memo file to open and add the characters to.

--memo=memofilename    Creates/Appends a file called memofilename, and causes an interactive dialog with user which allows user to input up to 2000 characters of "memo" information.

-s    Do Not list Alternate Data Streams. (NTFS only). [STREAM]=[ON|OFF]

-u    NTFS only. Display owner name of the file.

-U ownername    NTFS only. Display only files with this ownername.

-g + #    Where the # is replaced by a number indicating: list all files ‘g’reater than # days old. You can use a -g xx -l yy pair to bracket file ages. [OLDER]=50

-l + #    (ell, not one) Where the # is replaced by a number indicating: list all files ‘l’ess than # days old. You can use a -g xx -l yy pair to bracket file ages. To get today's files, use (-l 1) [NEWER]=10

-g + mm-dd-yyyy
-l + mm-dd-yyyy[acw]
(That's an ell, not a one.)
Process only those files (g)reater than (older than) or (l)ess than (newer than) this mm-dd-yyyy date. The date MUST be in the form mm-dd-yyyy. It MUST have a two digit month and day (leading 0 if necessary), and it MUST have a 4 digit year. The date given (mm-dd-yyyy) is NOT included in the calculation. That is, if today was 01-10-2003 and you entered -l 01-09-2003, you would only process today's files. If you wanted to include those on 01-09, you should have entered -l 01-08-2003.

The [acw] literals choose which time to base the mm-dd-yyyy test on. Any or all of [acw] can be used. If none is used, the default is w.

 examples:
-l 10-01-2005w (newer than) -g 12-01-2005w (older than) (files between 10-01 and 12-01-2005)
-l 12-31-2005c (newer than) -g 01-01-2007c (older than) (files with 2006 dates)
-l 10-20-2005acw
-g 12-05-2005wc
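Because the boundary dates themselves are excluded, it can help to see the test written out. The Python sketch below mirrors the rule as this manual states it (newer than the -l date AND older than the -g date, both exclusive). The function names are hypothetical, and the sketch compares whole dates only, ignoring the time of day and the [acw] selection.

```python
from datetime import date, datetime

def parse_mdy(text: str) -> date:
    """Parse the mm-dd-yyyy form required by the -g and -l options."""
    return datetime.strptime(text, "%m-%d-%Y").date()

def in_date_bracket(file_date: date, newer_than: str, older_than: str) -> bool:
    """True if file_date falls strictly between the two boundary dates."""
    # Both boundaries are excluded, matching "NOT included in the calculation".
    return parse_mdy(newer_than) < file_date < parse_mdy(older_than)
```

With the second example above (-l 12-31-2005 -g 01-01-2007), a file dated 01-01-2006 or 12-31-2006 passes, while files dated 12-31-2005 or 01-01-2007 do not.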

--newer=mm-dd-yyyy   --newer=01-01-2005
--older=mm-dd-yyyy   --older=12-31-2005
Files older and/or newer than these dates. This format is usually easier for people to comprehend.

-L + #    Where the # is replaced by a number indicating: list all files less than # bytes in size. (-L 100000) [LESSTHAN]=100000

-G + #    Where the # is replaced by a number indicating: list all files greater than # bytes in size. You can use a -GL pair to bracket file sizes. (-G 10000) (-G 10000 -L 100000) [GREATER]=10000

-P   
--pause  Pause after every 20 lines. Only useful if displaying to the screen and not using the -O output to a file. [PAUSE]=ON

-d + delimiter    Replace “delimiter” with a delimiter (typically a pipe ‘|’) within double quotes with which to delimit fields. If the delimiter is not printable, use its decimal ascii value but don’t place it in quotes. (-d “|”) [DELIMITER]=|
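Delimited records are easy to split apart in other software. As a rough Python sketch, assuming a pipe delimiter and the default four fields in the order shown in Sample Output (path, filesize, attributes, disk serial number or label), one record could be read like this. The names CatalogRecord and parse_record are hypothetical, and records produced with additional options will have more fields than this sketch reads.

```python
from typing import NamedTuple

class CatalogRecord(NamedTuple):
    path: str
    size: int
    attributes: str
    label: str

def parse_record(line: str, delimiter: str = "|") -> CatalogRecord:
    """Split one delimited record into its first four fields."""
    fields = [f.strip() for f in line.rstrip("\r\n").split(delimiter)]
    # Assumption: field order follows the default layout in Sample Output.
    path, size, attributes, label = fields[:4]
    return CatalogRecord(path, int(size), attributes, label)
```

Stripping each field removes the blank padding that fixed width fields carry.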

-t[acw3]    Show the file time as last ‘a’ccessed; last ‘w’ritten (modified); ‘c’reated; or show all ‘3’. No spaces between the -t and the modifier. ( -tc or -t3 ) Default is the ‘w’rite time, which is identical to what DIR or Explorer displays. Note: The 3 file time capability is only available under 32 bit operating systems using the 32 bit version of the program. If the -t is uppercase, -T, then the date is printed in YYYY-MM-DD format. The default is MM-DD-YYYY. If the [acw] is uppercase [ACW], then seconds are added to the time field.   [TIME]=[A|C|W|3], [ALLTIMES]=[ON|OFF]

-z    If using the 32 bit version, display time in ‘Z’ULU GMT format. The letters GMT will be at the end of the output line indicating such. Use GMT to get relative references especially when dealing with 2 or more time zones. (-z) [ZULU]=[ON|OFF]

-A[ehrsmdD]    Show only files with the following attributes: h=Hidden files, r=Readonly, s=system, d=directories only, m=modified, e=encrypted filesystem (NTFS 2K). The [hrsdm] must be entered immediately after the -A without any spaces. The -A is case sensitive. [HIDDEN|READONLY|SYSTEM|ARCHIVE|DIR_ONLY|ENCRYPTED]=[ON|OFF].

The differences between the -d and -D are that if the upper case -D is used, then ONLY directories are listed in the output. If the lower case -d is used, then directories are added to the output file and the -r (recurse) option MUST be used. (This is somewhat different than the way the Mdir program uses the -AD or -Ad options.)

-R  
--reset=[ON|OFF]  RESET the last access time to the original time. This reset is attempted after using an option that opens a file for reading. All files except those LOCKED by the operating system are reset. This same effect can be achieved if an environment variable RESET is set. (set RESET=1). This option is only available on the 32 bit version.

-c    Create a CRC32 checksum for each file and append it at the end of the filename (alters last access time on system and read-only files; normal files have last access time reset to original). It is also possible to create a 128 bit MD5 hash of the files. To do this, either put the keyword HASH=ON in the INI file or use the --hash command line option.

-eE “command %”    See EXEC -e option description below.

NOTE: For below header stuff, the file containing the headers has a filesize limitation of 50000 bytes or 500 LINES (including comments), whichever limit is met. This limitation was imposed because occasionally the header files being provided were corrupted and would cause the program to incorrectly execute. The limitation is designed as a safety factor in case the user provides a file which is not compatible with the program.

(Note: All -h operations alter last access time on files.)

The 'H' options, outlined below, can be very confusing, and can produce somewhat unexpected results. Please check your logic before putting them into production. See the File Headers section for some examples and further definitions. The usual Header option is the +h option.

+h + header_filename     Compares items in filename with headers of every file on disk. See description of “file headers” below. Shows file extensions of ALL (EVERY) file on the disk as the program believes the file to be based on information in the header file provided. This option produces a list of every file on the disk. (Note: this operation alters last access time on files.)

The ini setting of CATEGORY=ON can be used to refine the output record to include the user defined CATEGORY of the file. See the format of the header file below.

+H + header_filename   Compares items in filename with headers of every file on disk. See description of “file headers” below. If the file type matches one of the header types (i.e., is a file of that type) then the program outputs that file's information. This option outputs ONLY those files whose headers match those you supplied in the reference file. Use this option to selectively find specific file types for additional processing.

-h + header_filename   Similar to +h option. The program attempts to determine the file type of each file. It outputs a record for every file, but fills the file type field ONLY if the extension does not match those in the list supplied. All files whose extension match the file type are listed with a blank in this extension field. To find mismatched files, simply look in the extension field for data. (Note: this operation alters last access time on files.)

The header_file should contain as many headers as the user has available. The more headers provided, the better the chance of determining the file type. Contact Mares and Company for file headers. The program can only identify those headers that the user has supplied. So be careful and make your list as accurate as possible. Different header files can be used depending on the type of files searched for.

-H + header_filename    This is probably the hardest to understand and design for. The file types are checked against the header file list. ONLY those files whose extension is mismatched are output. Use this to select ONLY those mismatched files. This should give the smallest output if the header file is complete and accurate.
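Since the four header options are easy to confuse, the Python sketch below restates them as one decision function. It is a reading of the descriptions and sample outputs in this manual, not the program's actual logic. The function name type_field is hypothetical, extensions are compared without regard to case (an assumption), and "detected" stands for the type found from the header list, or None when no listed header matched.

```python
def type_field(mode, extension, detected):
    """Return the file-type field for one file, or None if it is not listed.

    mode       one of "+h", "+H", "-h", "-H"
    extension  the file's actual extension, without the period
    detected   the type found from the header list, or None if no match
    """
    mismatch = detected is not None and detected.lower() != extension.lower()
    if mode == "+h":   # every file is listed, type always shown
        return detected or "UNK"
    if mode == "+H":   # only files whose header is in the list
        return detected
    if mode == "-h":   # every file is listed, field filled only on mismatch
        if detected is None:
            return "UNK"
        return detected if mismatch else ""
    if mode == "-H":   # only files with a known header and a wrong extension
        return detected if mismatch else None
    raise ValueError("unknown mode: %r" % mode)
```

For a misnamed executable such as diskcat.exz, all four modes list the file with a type of exe. A correctly named .jpg is listed by +h and +H with its type, listed by -h with a blank type field, and omitted by -H.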

 

-i    Use the automatic label numbering procedures, and create/modify the file called DISKLABL. The numbering is designed to start at 1000. If you want it to start at 1001, then initialize the file DISKLABL to 1000.

-I + label    The disk_label can be up to 8 characters which will be prepended to the path.

-8:  Add the DOS 8.3 filename to the end of the record.

-88:  Add the uppercase Long File Name to the end of the record. This option strips the LFN from the path listing of the first field, and places only the LFN at the end of the record. The default length is a 75 character field. (Note: the -8 and -88 options are mutually exclusive. Use one or the other).

-88xx:  Replace the xx with a value. This value will now determine how wide the Long File Name field will now be. Use this to reduce the size from 75 to some other value. The default length for -88 is 75 characters.

--ziplog:  When a zip file is encountered, check its internal directory/contents and add these records to the output file listing. The zip files are identified by the PK header. Because the files must be opened to read the contents and check to see if they are zip files, the -R (reset) option is always set with this option, and can't be turned off. The directory contents of the zip file are included amongst the normal output records. Since a significant amount of the normal file processing may not be conducted on the contents (zipped files) of the zip file, many of the output fields with this option are left empty. For instance, zip file contents do not maintain create or access dates, so those columns are left blank. Hashing and CRC are not done, and header checks on the contents are not allowed. (09/2007) [ZIPLOG]

--ziplog=ziplogfilename:  Same as --ziplog except that if the =ziplogfilename is added, the contents of the zip files are placed in a separate ziplogfilename file, and not intermixed with the normal output. (09/2007) [ZIPLOG=ziplogfilename]

-5:  Add an MD5 hash field. Same as INI file: HASH=ON

INI Settings:

Following is a sample diskcat.ini file with most, if not all, of the appropriate keywords that Diskcat will recognize.

The INI settings that can only be set from the ini file are:
CATEGORY=ON    ; This installs the category column from the header file
SPLIT=xxx                 ;Set output file record counts to xxx maximum records per file. (ie: SPLIT=30000) Use this when intending to import the output to a spreadsheet with a maximum record limit.
HASH=ON                 ; Turns md5 hashing on for each file. MD5 value is placed before time fields.

The file is shown as all comments, so you can cut and paste from here.

CATEGORY=ON
;CATEGORY is only available in the .ini file.
SPLIT=xxx
;SPLIT is only available in the .ini file.
HASH=ON ;Turn on md5 hashing
RECURSE=OFF
files=*.exe
paths=d:\work
output=d:\tmp\junk
older=15
younger=180
lessthan=10000
greater=1000
width=45
delimeter=|
military=on
time=c
alltimes=ON
zulu=ON
stream=OFF
archive=ON
readonly=on
hidden=ON
system=ON
DIR_ONLY=on
directory=on
CRC=on
FIXED=ON
label=labelname
OWNER=ON
SORT=s
ziplog
ziplog=ziplogfile.txt



File Headers

The [+-][hH] + filename option allows you to provide, in an external text file, a list of standard extensions of files (e.g., exe, wp, dbf, gif) and the string of characters that should be found in the header of the target file if, in fact, that target file is of the type referenced by the extension.

For instance: a program .exe file should have as its first two characters in the file an MZ; a pkzipped file should have a PK as part of the file header.

Setting up the reference.fle

The text file containing the reference extensions and headers will be referred to here as "reference.fle." This file should be set up in the following manner:

One line for each file type indicated, and it is case dependent.

The reference.fle should be created with an ascii text editor. No word processor formats are recognized. AFTER THE LAST LINE, AT LEAST ONE BLANK LINE SHOULD BE ENTERED. Maximum of 100 lines/file types to test for.

The lines consist of 3 or 4 parts. Each must be in the correct format and location for the program to work.

part 1: the category you wish to place this header in, e.g., DOCUMENT, PROGRAM, GRAPHIC, SPREADSHEET, or any category word you wish. This is strictly user defined. This text will be placed in the output record if the CATEGORY=ON trigger is included in a diskcat.ini file.

part 1A: a comma (,). This will separate part 1 from part 2.

part 2: The "TRUE" extension you expect to see on the file (e.g., exe wp gif). No leading period is allowed.

part 2A: (optional section) a colon (:) followed by a number. (SEE NOTE BELOW).

part 3: a comma (,). This will separate part 2 from part 4.

part 4: header string.

If the first character of the line is a # (pound sign) or a ; (semi colon) this line is completely ignored and is considered a comment.

#exe,MZ    This is a comment line.

NOTE: If the expected header signature (ex., Pklite) is located at some position other than the 1st position of the file, then add a colon (:) followed by the byte location (displacement) into the file where the header signature is expected to be found. An example for a 16 bit self extracting PKZIP file would be (zip:66). The same self extracting zip file created under WINZIP32 commercial version would be (zip:136)

COMPRESSED,ZIP:136,XD39128360000000000000000E0000E010B01041400  

This is the signature for that WINZIP32 bit self extracting executable.

The header string consists of the string of characters that should be looked for to determine if the file in question is the type of file referenced in part 2. Since this string is taken as a literal, it should not have any spaces anywhere within it except those spaces that should be considered as an actual part of the file header.

If you wish, this header string can be a hex value. In this case it must begin with an ‘X’, and the hex values must be each 2 characters wide. Use this if you cannot easily input the values with an ascii editor. Ascii header strings, and hex headers strings can be used on different lines in the same file.
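To make the line format concrete, here is a minimal Python sketch that parses one reference line and tests a block of file data against it. It is an illustration only, not the program's parser. The names Signature, parse_signature and matches are hypothetical. The sketch assumes every header string starting with an upper case X is hex, and it treats the number after the colon as a zero-based byte displacement, which this manual does not state explicitly.

```python
from typing import NamedTuple, Optional

class Signature(NamedTuple):
    category: str
    extension: str
    offset: int
    pattern: bytes

def parse_signature(line: str) -> Optional[Signature]:
    """Parse one 'category,ext[:offset],header' line; None for comments."""
    if not line.strip() or line[0] in "#;":
        return None
    # Split on the first two commas only, so the header keeps any commas.
    category, ext_part, header = line.rstrip("\r\n").split(",", 2)
    extension, _, offset = ext_part.partition(":")
    if header.startswith("X"):
        pattern = bytes.fromhex(header[1:])   # hex form: 2 characters per byte
    else:
        pattern = header.encode("ascii")      # literal ascii header string
    return Signature(category, extension, int(offset or 0), pattern)

def matches(data: bytes, sig: Signature) -> bool:
    """True if the signature bytes appear at the signature's displacement."""
    # Assumption: the displacement counts from 0 at the start of the file.
    return data[sig.offset:sig.offset + len(sig.pattern)] == sig.pattern
```

For instance, parsing "PROGRAM,exe,X4D5A" yields the two bytes MZ at displacement 0, so any data beginning with MZ matches it.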

Below is a sample header file. Notice that the first line is in a different format (as described above).

Sample reference file:

COMPRESSED,ZIP:136,XD39128360000000000000000E0000E010B01041400
zip,X504B
PROGRAM,exe,X4D5A
ENCRYPTION,pgp,X84
PROGRAM,com,XE8
PROGRAM,bat,@echo
PROGRAM,bat,set
PROGRAM,bat,SET
GRAPHIC,gif,X47494638
GRAPHIC,jpg,XFFD8FFE0
GRAPHIC,pcx,X0A050101

Notice that the compressed zip header (1st line) was placed before the exe header. This is because, had the exe header come first, the program would have indicated an exe file and would never have gotten to the self extracting zip header. Notice also that the category for that file is COMPRESSED rather than PROGRAM, so that the output makes it evident that it is a zip file, not a true executable.

Because the header list is checked in the order it is found in the header file, you should place the most restrictive file types first in the header file. An EXE file should have an MZ as its header. Let's take a case where another type of file had a header of MZH. If the EXE,MZ line came first in the header file, then the MZH file would produce an incorrect output. So put the MZH line first in the header file. This becomes important with files containing possible database headers like DB or DBASE.
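The effect of ordering can be shown with a small first-match-wins sketch in Python. This only illustrates the rule described above; the identify function and the "mzh" extension are hypothetical stand-ins for the MZH example.

```python
def identify(data: bytes, signatures):
    """Return the extension of the first signature that matches, else None."""
    for extension, pattern in signatures:
        if data.startswith(pattern):
            return extension   # checking stops at the first match
    return None

# The less restrictive MZ entry first hides the MZH entry entirely.
wrong_order = [("exe", b"MZ"), ("mzh", b"MZH")]
# The more restrictive MZH entry first lets both types be told apart.
right_order = [("mzh", b"MZH"), ("exe", b"MZ")]
```

With wrong_order, data beginning with MZH is reported as exe; with right_order it is reported as mzh, while ordinary MZ files are still reported as exe.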

If it is not a correct extension, the program prints as the 1st three characters of the output the reference extension found in the reference.fle thus indicating what the extension SHOULD have been.

Here is an output without using any of the header options. It just shows what files are there. The .uni files are true Microsoft unicode files. All others are true as shown, except that the .exz file is really a misnamed executable.

D:\TMP\junk.uni             2000 ..R.. 
D:\TMP\lesson.uni             30 ..R.. 
D:\TMP\COKE_ALL.jpg        20130 A.... 
D:\TMP\COKE_2.jpg          15203 A.... 
D:\TMP\CLEANUP.BAT            98 A.... 
D:\TMP\DISKCAT.EXE        135368 A.... 
D:\TMP\HEADERS.HEX           128 A.... 
D:\TMP\OUTPUT                  0 A.... 
D:\TMP\diskcat.exz        135368 A.... 

SAMPLE OUTPUTS for reference file above:

(1) Same output using this command line: diskcat -h headers.hex (list EVERY file, but only SHOW true headers of those with mismatched names). Notice that the .uni and .hex extensions are not listed in the reference header file, so those files are marked UNK (unknown).

D:\TMP\junk.uni             2000 ..R..  UNK 
D:\TMP\lesson.uni             30 ..R..  UNK 
D:\TMP\COKE_ALL.jpg        20130 A.... 
D:\TMP\COKE_2.jpg          15203 A.... 
D:\TMP\CLEANUP.BAT            98 A.... 
D:\TMP\DISKCAT.EXE        135368 A.... 
D:\TMP\HEADERS.HEX           128 A....  UNK 
D:\TMP\diskcat.exz        135368 A....  exe 

(2) Same run but with the command line: diskcat -H headers.hex (ONLY MISmatches are output). This run is based solely on the list in the header reference file. Since the .uni and .hex files are not even listed as valid headers, they are not checked. However, the exe header is listed in the reference file, and a misnamed file was found, so it was listed.

D:\TMP\diskcat.exz        135368 A....  exe 24F9-7921

(3) Same run with +h headers.hex. Show extensions of EVERY file. If the header is not listed, it is displayed as UNK(nown). This would probably be the default run for any catalog list. You can then sort on the field containing the type of file so you have a neat list sorted by file type.

D:\TMP\junk.uni             2000 ..R..  ASC 
D:\TMP\lesson.uni             30 ..R..  UNK 
D:\TMP\COKE_ALL.jpg        20130 A....  jpg 
D:\TMP\COKE_2.jpg          15203 A....  jpg 
D:\TMP\CLEANUP.BAT            98 A....  bat 
D:\TMP\DISKCAT.EXE        135368 A....  exe 
D:\TMP\HEADERS.HEX           128 A....  ASC 
D:\TMP\diskcat.exz        135368 A....  exe 

(4) Same run using the final +H option: diskcat +H headers.hex (ONLY show those files whose header is matched in the list). This option is good for identifying specific file types on the drive. You might have a header list of only graphic headers, so the list will only show graphic files. Notice that only those files with a known signature in the header file were output. NONE of the UNKnown types were listed.

D:\TMP\COKE_ALL.jpg        20130 A....  jpg 
D:\TMP\COKE_2.jpg          15203 A....  jpg 
D:\TMP\CLEANUP.BAT            98 A....  bat 
D:\TMP\DISKCAT.EXE        135368 A....  exe 
D:\TMP\diskcat.exz        135368 A....  exe 

          *** ALL -hH options alter last access time. ***
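
The four header options can be summarized as a small decision function. The Python sketch below is inferred only from the sample outputs above, not from Diskcat's source; the function name and arguments are hypothetical, and the built-in ASC (text) detection seen in the +h sample is not modeled.

```python
# Hypothetical summary of the -h, -H, +h and +H behaviour, inferred from
# the sample outputs only. ASC (text) detection is not modeled.

def report(name, detected, mode):
    """Return the type tag to print for a file, or None to omit the file.

    name     -- file name, e.g. 'diskcat.exz'
    detected -- extension from the matching header line, or None if no match
    mode     -- one of '-h', '-H', '+h', '+H'
    """
    ext = name.rsplit(".", 1)[-1].lower() if "." in name else ""
    mismatch = detected is not None and detected.lower() != ext
    if mode == "-h":            # every file; tag only mismatches and unknowns
        if detected is None:
            return "UNK"
        return detected if mismatch else ""
    if mode == "-H":            # only mismatched files
        return detected if mismatch else None
    if mode == "+h":            # every file, always tagged
        return detected if detected is not None else "UNK"
    if mode == "+H":            # only files whose header is in the list
        return detected
    raise ValueError(f"unknown mode: {mode}")


files = [("junk.uni", None), ("COKE_2.jpg", "jpg"), ("diskcat.exz", "exe")]
for mode in ("-h", "-H", "+h", "+H"):
    shown = [(n, report(n, d, mode)) for n, d in files]
    print(mode, [(n, tag) for n, tag in shown if tag is not None])
```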

If CATEGORY=ON is found in the diskcat.ini file, then the additional category field is included.

D:\TMP\junk.uni             2000 ..R..  ASC      TEXT
D:\TMP\lesson.uni             30 ..R..  UNK   UNKNOWN 
D:\TMP\COKE_ALL.jpg        20130 A....  jpg   GRAPHIC
D:\TMP\COKE_2.jpg          15203 A....  jpg   GRAPHIC 
D:\TMP\CLEANUP.BAT            98 A....  bat   PROGRAM 
D:\TMP\DISKCAT.EXE        135368 A....  exe   PROGRAM
D:\TMP\HEADERS.HEX           128 A....  ASC      TEXT 
D:\TMP\diskcat.exz        135368 A....  exe   PROGRAM 

EXEC (-e) OPTION

The exec enhancement uses a command line option to execute either a DOS internal command (e.g., copy, del, dir) or a program. The term 'command' will be used in the following discussion to mean both program and DOS command.

The -e or exec option is most effective when used in conjunction with options that can identify certain selected files to perform the command on. It works in a similar fashion to the -f option. As described above, the -f option locates, on the disk, those files which meet certain filename criteria (ex., *.bat).

When a file is located (under whatever option is ultimately used), the filename is passed to the command requested by the exec option. Examples would be to use the type command to look at all the *.bat files on the entire disk, to do a dir on all the directories located in a specific path, or to do a dir on all the files over a certain number of days old.

The format of the exec option is as follows:

-e  “command %”

The -e (e)xec option is used to execute “command” on the file(s) ‘%’ found.

The actual syntax is:

-e  “command  [arguments] % [arguments]”  

Where:

    the -e is the actual option. If a lower case -e is used, then the entire filename including path is substituted for the %. If an upper case -E is used, then only the filename is substituted for the %. The quotes around the rest of the option syntax are mandatory. This is so DOS will hold the entire item and pass it as one string to the program.

    command is replaced by the command you wish to run.

    the arguments are any additional filenames or options needed for the command chosen. The % is placed at the location where you want the program to insert the name of the file it finds. The % is positionally sensitive and should be placed in the exact location where the selected file would have been placed in the chosen command.

For example: A command to do ‘dir’ on all ‘.bat’ files in ‘c:\sample’ path would look like this:

diskcat  -f  *.bat  -p  c:\sample  -e  “dir %”

Notice the retention of the quotes(“).

For example: A command to zip all *.bat files into an output.zip file, maintaining their appropriate paths, would be:

diskcat  -f *.bat  -e  “pkzip  -ap output.zip %”
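
The % substitution described above can be pictured with a short Python sketch. It is illustrative only: the function name is hypothetical, and replacing only the first % is an assumption, since the text does not say what happens if more than one % is given.

```python
import ntpath  # Windows-style path handling that works on any platform


def build_command(template, found_path, name_only=False):
    """Replace the % placeholder in an exec template with the found file.

    name_only=False mimics -e (full path substituted);
    name_only=True mimics -E (file name only).
    Assumption: only the first % in the template is replaced.
    """
    target = ntpath.basename(found_path) if name_only else found_path
    return template.replace("%", target, 1)


found = r"c:\sample\run.bat"
print(build_command("dir %", found))
# dir c:\sample\run.bat
print(build_command("pkzip -ap output.zip %", found, name_only=True))
# pkzip -ap output.zip run.bat
```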

NOTE: If the command used is NOT a DOS internal command and is instead a program, the program SHOULD be a .exe executable and reside on a subst drive letter of x:. This is because Diskcat normally ONLY looks on drive x: for .exe programs to run. If it cannot find the program there, it assumes the command is a DOS internal and attempts to run it as one. In some instances it will run programs located in the DOS path. If you are attempting to run one of these, try it first to see if it operates correctly. You might also try entering the program name as a complete path and name with the proper extension (.com, .bat), which may provide more reliable results. (e.g.:  diskcat -f *.bat -e “d:\work\run.bat  %”)

SPECIAL ZIP CAPABILITY

This section deals with a special implementation of the -e execute command when you have zip files located in directories, and wish to extract ALL the files located in the zip files in the correct locations. The zip files could have been placed there by the upcopy command, or FTK, or any other program to move zip files to a specific location.

The user MUST have access to a command line version of pkzip. The current version I have is identified as pkzip32, indicating it is a full 32 bit long file name version.

The addition to the -e option is an upper case P directly after the -e, indicating that a PATH is to be inserted somewhere in the command line. This is needed for PKZIP to know where to run the command from.

After the -eP, you use a syntax similar to the basic -e option, except you add a cd command, for change directory. You also put a placeholder, -PATH, in the command line where you want the program to insert the path to use. This is sort of a wildcard replacement.

The last item is to provide the correct command line syntax for the OS to change to the -PATH directory && (and) execute the pkzip program. The full command is below, and the syntax should be followed exactly. You can modify the specific pkzip options, but those listed should extract all contents, in appropriate folders.

This is the command line:
C:>diskcat -f *.zip -eP "cd -PATH && pkzip32 -extract -directories -recurse % -overwrite"

The -eP says we are going to use a path to change to.
The cd -PATH is the trigger to tell the program to perform that cd operation.
The % is the usual replacement of the filename, which will be a zip filename.
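
As a rough illustration of the two placeholders, the Python sketch below expands a -eP style template for one located zip file. It is an assumption-laden sketch, not Diskcat's code: the function name and the example path are hypothetical, and it assumes that -PATH becomes the directory holding the zip file and that % becomes the full path, as with the lower case -e.

```python
import ntpath  # Windows-style path handling that works on any platform
import re


def build_path_command(template, found_path):
    """Expand -PATH and % in a -eP style template for one located file.

    Assumptions: -PATH is replaced by the directory containing the file,
    and % by the file's full path (as the lower case -e does).
    """
    values = {"-PATH": ntpath.dirname(found_path), "%": found_path}
    # One pass over the template, so substituted text is never rescanned.
    return re.sub(r"-PATH|%", lambda m: values[m.group(0)], template)


template = "cd -PATH && pkzip32 -extract -directories -recurse % -overwrite"
print(build_path_command(template, r"d:\cases\a\docs.zip"))
# cd d:\cases\a && pkzip32 -extract -directories -recurse d:\cases\a\docs.zip -overwrite
```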



Command Lines

diskcat
/*lists all files on default drive to screen*/

diskcat -?
/* obtain help screen */

diskcat  -o outputfile
/*lists all files to output file called outputfile */

diskcat  -a  -o outputfile
/*append output to existing output file */

diskcat  -O outputfile
/*append output to existing output file */

diskcat  -p d:\work\
/* start search at this directory */

diskcat  -o output -p a:\ -I 1001
/* create a label of 1001 and place the output to output */

diskcat  -O output -p a:\ -I 1001
/* this will append */

diskcat  -i -p a:\ -O d:junk
/* create automatic label of a: with automatic append */

diskcat -p a: +h headers.hex
/* check drive a:, and compare headers in headers.hex */


Related Programs

Crckit

Hash

Hashcmp
