HASH



Author: Dan Mares, info @ maresware . com
Portions Copyright © (1998-2008) Mares and Company, LLC
Phone: (770)242-6687 X119
Last Update: Aug. 10, 2007

PURPOSE

The program HASH.exe is designed to calculate the 128 bit MD5 hash total of a file using the MD5 Message-Digest Algorithm from RSA Data Security, Inc. Depending on the options chosen, the user can bypass the hashing calculation, thus producing a default catalog of every file on the disk, or can instead calculate the 32 bit CRC (CCITT) or any of the SHA (Secure Hash Algorithm) values (160, 256, 384 and 512 bit calculations).

MD5:

Searching any one of these, and many related sites, will give insight into the implementation and reliability of the MD5 algorithm.

http://andrew2.andrew.cmu.edu/rfc/rfc1321.html
http://www.columbia.edu/~ariel/ssleay/rfc1321.html
http://www.kashpureff.org/nic/rfcs/2200/rfc2202.txt.html
http://www.cs.auckland.ac.nz/~pgut001/cryptlib

These link(s) are excellent research pages, and included just for informational purposes.

http://ciac.llnl.gov/ciac/CIACHome.html

SHA-1:

The NIST-recognized SHA-1 and SHA-2 (256, 384, 512) Secure Hash Algorithms have also been implemented. Use of the -s, -256, -384, -512 or -B option will produce the various SHA calculations instead of the default MD5 alone (-B produces both the MD5 and the SHA-1). The SHA calculation is the only secure hash algorithm currently recognized by NIST.

More information on the SHA algorithm and certification can be found at: http://csrc.ncsl.nist.gov/cryptval and http://csrc.nist.gov/cryptval/140-1/1401labs.htm

SHA-2:

Hash also currently supports the NIST SHA-2 versions of the Secure Hash Algorithm. There are three versions of SHA-2: 256, 384 and 512 bit. These are implemented as the options -256, -384, and -512. When using these options, the -s option may also be used to get a full range of SHA values. A little bit of overkill.


SHA2 Copyright:

The SHA2 code implemented in this program was modified from code written by:

AUTHOR:Aaron D. Gifford <me@aarongifford.com>
Copyright (c) 2000-2001, Aaron D. Gifford All rights reserved.
Redistribution and use in source and binary forms, with or without modification are permitted provided that the following conditions are met:
1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
3. Neither the name of the copyright holder nor the names of contributors may be used to endorse or promote products derived from this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTOR(S) ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.


By default the HASH program produces an excellent fixed length output record of the entire file listing (catalog) of a disk drive. This is useful for cataloging files on drives. Delimiters can be inserted (-d option) between the fields of the output record so importation into wannabe data bases can be achieved.

Hash can calculate the hash value for a single file, for files in an entire directory, files in an entire path, or files on an entire logical drive, or drives. Specific file types can be excluded from the calculation with the -X  option.

The calculation of hash values of files has a number of different uses.

The hash of a file can be used as a verification of the state of a file at a certain time. Identical hash values mean the files are identical. Different hash values mean the files have differences. These similarities or differences can have uses in forensic verification, virus detection, file authenticity and others. Some people use a hash library to see if a file is the same as its original shrink-wrapped version.
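The verification idea above can be sketched in a few lines of Python. This is only an illustration using Python's standard hashlib module, not the code inside HASH.exe; the function names file_digest_hex and same_contents are made up for the example.

```python
import hashlib

def file_digest_hex(path, algorithm="md5", chunk_size=1 << 20):
    """Return the uppercase hex digest of a file, reading it in chunks."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        # Read fixed-size chunks so large files do not have to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest().upper()

def same_contents(path_a, path_b, algorithm="md5"):
    """True when both files produce the same digest (hypothetical helper)."""
    return file_digest_hex(path_a, algorithm) == file_digest_hex(path_b, algorithm)
```

Comparing a file against a stored reference value works the same way: compute the digest now and compare it with the digest recorded earlier.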

Some processing statistics:

These stats below are relatively old, so take them with a grain of skepticism. As of 8/2007, on a top end computer running a 133 MHz PCI drive bus, hash will process about 2 GIG per minute. On a top end SATA drive it goes up about 50%. Any statistics you read should be confirmed on your own setup, as sample runs can be tweaked to make any software look good.

The statistics shown here are for our computers. Your setup may not experience the same speeds, or may get better results. This is only a sample of what we have available.

On current state of the art CPUs with speeds in excess of 3 GHz, and SATA drives, we have gotten over 2 GIG/Minute throughput on large files. These results are much better than the statistics below reflect. I've just not had time to redo those runs on current hardware.

Users are encouraged to perform their own tests on the hardware which they will actually be using.

Most of the tests were done on a 2.8 GHz CPU. The large (2.8G) file (yes it was on an NTFS partition) was on a 7200 RPM IDE drive, and the test of the 13,000+ files (total size approx 2.8 gig) was done on a 7200 Wide SCSI drive.

The runs were done to see if performing the MD5, SHA-1, SHA-2 (256, 384, 512) calculations would have a significant effect on processing times. Most of the runs were done with software that would perform both the MD5 and SHA-1 calculation at the same time. There are a few programs that would perform all 5, and some that would only perform 1 at a time (sha4labs and md5 are two that apparently only perform one calculation, either the SHA-1 or the MD5). However, in most situations, it was found that adding the SHA-1 to the MD5 didn't amount to any significant increase in processing time.

The table of results is below. (These tests were run September 2004.) By the time you read this, the CPU speed may have increased significantly. Don't forget, everything is relative.

The programs were from the following persons or entities:
md5deep written by Jesse Kornblum can be found at: http://md5deep.sourceforge.net
2hash is written by Thomas Akin of ISS and can be found at: http://crossrealm.com/2hash/
sha4labs is an older program from the Netherlands Forensic Institute. (I couldn't find a current distribution site)
fsum is from slavasoft.com at: http://www.slavasoft.com/fsum/overview.htm
md5 is from sandersonforensics at: http://www.sandersonforensics.co.uk. It is a purely GUI program, and the timing was a little difficult to determine.
hash and sha_verify can be found here at maresware.com.
If I have the authorship incorrect on any of these programs, please let me know.

Processing times are not easy to compare. A factor that is usually not considered is the combination of number of files and total file size. A single 3 GIG file can be processed significantly faster than 13000 normal files (because of file open/close overhead). Once reviewed, this seems like common sense. The other factors which have a significant role are the CPU and hard drive speed. Some calculations were done on two different CPUs to get a feel for the effect of CPU speed. It is almost obvious that you would want to use the fastest CPU you can get to perform the processing. At that point, it comes down to hard drive access times. The primary tests were done on a system where the CPU was much faster than the hard drive. This effectively takes the CPU speed out of the equation and leaves the hard drive as the bottleneck. These are the runs on the 2.8 GHz CPU.

A simple batch file to test the time of these programs can be found at: ftp.dmares.com/pub/batch_files/test.bat
You will obviously need the software. Most of it should be available at the sites listed above.

*   == only the MD5 algorithm was run.
**  == all 5 algorithms were run (MD5, SHA-1, SHA2 (256, 384, 512))
*** == only SHA-1

Program Name    No. of files  File Size (approx)  CPU Speed  HD Speed    Time
hash            1             2.8G                2.8GHz     7200 IDE    1 m. 36 s.
2hash           1             2.8G                2.8GHz     7200 IDE    1 m. 31 s.
md5deep *       1             2.8G                2.8GHz     7200 IDE    1 m. 32 s.
fsum            1             2.8G                2.8GHz     7200 IDE    1 m. 32 s.
sha_verify      1             2.8G                2.8GHz     7200 IDE    1 m. 34 s.
sha4labs ***    1             2.8G                2.8GHz     7200 IDE    1 m. 32 s.
md5 *           1             2.8G                2.8GHz     7200 IDE    2 m. 00 s.

The following were the only programs easily recursed:

Program Name    No. of files  File Size (approx)  CPU Speed  HD Speed    Time
hash *          13,430        2.8G                2.8GHz     7200 WSCSI  3 m. 11 s.
hash            13,430        2.8G                2.8GHz     7200 WSCSI  3 m. 41 s.
hash **         13,430        2.8G                2.8GHz     7200 WSCSI  7 m. 03 s.
md5deep *       13,430        2.8G                2.8GHz     7200 WSCSI  2 m. 25 s.
fsum            13,430        2.8G                2.8GHz     7200 WSCSI  2 m. 41 s.

The following were run on a different setup (slower CPU). The times did not increase significantly, possibly indicating that the drive speed is still a factor.

Program Name    No. of files  File Size (approx)  CPU Speed  HD Speed    Time
hash            1             2.8G                1.2GHz     7200 IDE    1 m. 52 s.
2hash           1             2.8G                1.2GHz     7200 IDE    1 m. 31 s.
md5deep *       1             2.8G                1.2GHz     7200 IDE    1 m. 20 s.
fsum            1             2.8G                1.2GHz     7200 IDE    1 m. 29 s.
sha_verify      1             2.8G                1.2GHz     7200 IDE    2 m. 12 s.
sha4labs ***    1             2.8G                1.2GHz     7200 IDE    1 m. 31 s.

That concludes our tests.
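To put the timings into throughput terms, here is a small worked calculation in Python. It is only arithmetic on two rows of the results (hash on the 2.8 GHz CPU), and it assumes that "2.8G" means 2.8 x 1024 megabytes, which the original figures do not state exactly.

```python
def throughput_mb_per_s(size_gb, minutes, seconds):
    """Approximate throughput, treating 1 G as 1024 MB (an assumption)."""
    elapsed_seconds = minutes * 60 + seconds
    return size_gb * 1024 / elapsed_seconds

# hash, 1 file of 2.8G, 1 m. 36 s.
single_file = throughput_mb_per_s(2.8, 1, 36)
# hash, 13,430 files totalling about 2.8G, 3 m. 41 s.
many_files = throughput_mb_per_s(2.8, 3, 41)

print(f"single large file: {single_file:.1f} MB/s")   # about 29.9 MB/s
print(f"13,430 files:      {many_files:.1f} MB/s")    # about 13.0 MB/s
```

The same amount of data takes more than twice as long when spread over thousands of files, which is the open/close overhead mentioned above (the two runs also used different drives).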


Program Output:

The output record is normally (unless modified by the user) a 160 character record. I am telling you this because I can't tell you how many users create an output file, open it with an editor, and then call to say, "I get no MD5 value." My suggestion is to look to the right of the screen. Here is a sample output (wrapped at 80 characters) for your information. The record that starts with D:\TEMP\helpstuf\WSPLIT.HPJ is actually one output line of 160 characters.

**************************************************************
Program started Wed Apr 12 13:52:19 2000 GMT, 09:52 Eastern Standard Time (-4)
c:\utils\ntutils\HASH.EXE wsplit.hpj -o \tmp\junk

 -------- BEGIN PROCESSING MD5 -----------
D:\TEMP\helpstuf\WSPLIT.HPJ 2DA1B0C315D7D92B42DD3F13B82D5704 173 04/09/1996 06:06w EST
 -------- END PROCESSING MD5 -----------

Processed 0 directories, 1 files, 173 bytes:

Elapsed: 0 hrs. 0 mins. 0 secs.

*****************************************************************************

Processing NOTE:

When using the -O or -a (append to an existing output file) the lines that begin with

"-------- END PROCESSING MD5 -----------"

and the statistics on the bottom of the page are removed so the additional hash values can be added. Because of this, the final processing statistics


Processed 0 directories, 1 files, 173 bytes: 
Elapsed:  0 hrs. 0 mins. 0 secs.

will only reflect those for the current run. I do not attempt to keep a running total of the number of files (entries) in the output file. It is an easy matter to figure out how many entries are in the output file, just by opening it with a good text editor and looking at the line count.

The output of the program is intended to be placed in an output file for future reference such as verification that files were not altered. This is important when certifying that file contents were not altered during forensic examination or duplication for analysis.

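If a file's contents were altered in any way, the hash value calculated would be different from the original. The MD5 algorithm has been reviewed and tested extensively by cryptologists. Security in this context means that two different files will not, in practice, accidentally produce the same hash value. Be aware that researchers have shown since 2004 that pairs of files with the same MD5 value can be constructed deliberately, so where deliberate tampering is a concern, consider adding one of the SHA options.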

For documents describing the operation and reliability of the MD5 algorithm a search of the World Wide Web for MD5 will provide hundreds of sites and documentation.

The MD5 algorithm produces a 128 bit value (16 bytes, 32 printed HEX characters). With 2 ** 128 (roughly 3.4 x 10 ** 38) possible values, the chance that two different files will accidentally produce the same value is negligible.

The SHA-1 algorithm produces a 160 bit value (20 bytes, 40 printed HEX characters) and is a NIST certifiable algorithm. With 2 ** 160 possible values, an accidental match between two different files is even less likely.
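The bit, byte and printed HEX sizes of each algorithm can be checked with Python's standard hashlib module. This is an illustration only; the helper name digest_sizes is made up for the example and has nothing to do with how HASH stores its values.

```python
import hashlib

def digest_sizes(names=("md5", "sha1", "sha256", "sha384", "sha512")):
    """Map each algorithm name to (bits, bytes, printed hex characters)."""
    sizes = {}
    for name in names:
        size_bytes = hashlib.new(name).digest_size
        sizes[name] = (size_bytes * 8, size_bytes, size_bytes * 2)
    return sizes

for name, (bits, size_bytes, hex_chars) in digest_sizes().items():
    print(f"{name:7s} {bits:4d} bits {size_bytes:3d} bytes {hex_chars:4d} hex characters")
```

For MD5 this reports 128 bits, 16 bytes and 32 hex characters; for SHA-1 it reports 160 bits, 20 bytes and 40 hex characters.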



OPERATION

Even though HASH is a 32 bit program it MUST be run from the command line. It will run under any of the current Windows operating systems, and there is also a Linux version that provides a virtually identical output format.

The user provides HASH with appropriate options on the command line. Hash can run from a batch file which means, for forensic purposes it can run unattended.

Run without any options,

(C:>hash)

HASH defaults to calculate the hash values of all files in the current default directory, and all sub-directories.

The user supplies various options to modify or enhance the program operation.

If no file type is provided, the default is all files (-f *.*). If no path is provided, the current default directory (-p .) is used as a starting point, and a recursive hash is done from there. Options are available for modifying how the program searches for files.

Depending on the options supplied by the user, the program can calculate the hash of a single file

(C:>hash anyfile)

or all files in a single directory

(C:>hash -p c:\this_dir -r)

or recurse an entire disk drive.

(C:>hash -p c:\)

Hash can also search for specific file types (i.e. *.exe, *.bat), or search down selected paths. More than one file type, and more than one path can be used at once.

(C:>hash -p c:\this_dir c:\that_dir -f *.exe *.bat)

The file types and paths provided by the user on the command line are used to build a matrix which HASH uses to select files. If more than one path and/or file type is listed, hash builds a matrix and incorporates all the requested file types into the search in each path.
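The path and file type matrix can be pictured with a short Python sketch. This is an assumption about the general idea (every file type is tried in every path), not a description of HASH's internal data structures; build_matrix and select_files are hypothetical names.

```python
import fnmatch
import os
from itertools import product

def build_matrix(paths, filespecs):
    """Every (path, filespec) combination, in command-line order."""
    return list(product(paths, filespecs))

def select_files(paths, filespecs):
    """Walk each path recursively and yield files matching any filespec."""
    for start in paths:
        for dirpath, _dirnames, filenames in os.walk(start):
            for name in filenames:
                # Case-insensitive match, as on Windows. Note that fnmatch
                # treats "*.*" literally (a dot is required), unlike DOS.
                if any(fnmatch.fnmatch(name.lower(), spec.lower()) for spec in filespecs):
                    yield os.path.join(dirpath, name)

# Two paths and two file types give four combinations to search.
print(build_matrix([r"c:\this_dir", r"c:\that_dir"], ["*.exe", "*.bat"]))
```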

After HASH has determined it has enough information, it proceeds to find all the files requested and to calculate either the MD5, 32 bit CRC or SHA of the file. It then prints the values on the screen. If an output file was requested it writes to the output file. HASH does NOT write to the hard disk unless specifically requested by the user to create an output file.

The space allotted for the output is generally maintained at a default of 40 spaces to accommodate the largest SHA-1 output. This means that if the CRC was asked for, there is a lot of empty space in the output record.

Whatever output is chosen, the chances of two dissimilar files accidentally producing the same calculated value are slim to none. Both the 128 bit MD5 hash and the 32 bit checksum (CRC) are dependable in that sense. The 32 bit checksum will produce a duplicate about 1 time in 4,000,000,000 (2 ** 32). The 128 bit is not worth mentioning. None of us will live that long. (Actually the chances of a duplication are 1 in 2 ** 128, which is roughly 3.4 x 10 ** 38); and for the SHA they will be 1 in 2 ** 160, which is astronomical.

The output records are fixed length records that can be imported into a data base for reference and cross matching with a later generated output. The headers must first be removed for this to occur. Or the program can be run with the -v (no verbose) option to not print the headers and footers. If the -w option is used, the output record length is altered accordingly. But for any particular set of options, the output record sizes are identical.
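The size of these numbers is easy to check. The few lines of Python below are just arithmetic on the number of possible values for each output width; possible_values is a made-up helper name.

```python
def possible_values(bits):
    """Number of distinct values an output of this many bits can take."""
    return 2 ** bits

for label, bits in (("32 bit CRC", 32), ("128 bit MD5", 128), ("160 bit SHA-1", 160)):
    print(f"{label}: 2 ** {bits} = {possible_values(bits):.3e}")
```

The 32 bit figure is 4,294,967,296, which is where "about 1 in 4,000,000,000" comes from; the 128 bit figure is about 3.403e+38.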

Diskcat has a capability with a -c option to create a 32 bit Checksum of the file. In sample runs on 486 and Pentium computers the HASH program took about 50-75% longer to run. This is because the hashing algorithm used is much more computationally intensive than the Checksum. When doing the same analysis using floppy disk files, the time differences were negligible. This is because the slow disk access time has an effect on the program execution time.

File List Sources: In some instances, the user may provide a list of files that are to be hashed. This list can be derived from any number of sources that the user has available. The "list" processing is similar to the upcopy -s source_list process. The user provides a text file containing the full path of each file to hash, and the program reads that list and performs the required functions. Since this is a late add-on option, it has no single letter -option mnemonic. However, it is implemented with the Linux style --source=listfilename option. See options below.
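The list-driven processing can be sketched in Python as follows. This is an illustration of the idea only (one full path per line of a text file), assuming a UTF-8 list file; it is not the parsing code behind --source, and hash_listed_files is a hypothetical name.

```python
import hashlib

def hash_listed_files(list_path, algorithm="md5"):
    """Yield (path, hex digest or None) for each path named in a list file."""
    with open(list_path, "r", encoding="utf-8") as listing:
        for line in listing:
            path = line.strip()
            if not path:
                continue  # skip blank lines in the list
            try:
                h = hashlib.new(algorithm)
                with open(path, "rb") as f:
                    for chunk in iter(lambda: f.read(1 << 20), b""):
                        h.update(chunk)
                yield path, h.hexdigest().upper()
            except OSError:
                # Missing or unreadable file: report it without a digest.
                yield path, None
```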

A NOTE of caution.

If using either version of HASH on a 32 bit OS (NT, XP, WIN9X) file system, the “LAST ACCESS” time of the file will be changed. The calculation of the hash value requires the opening of the file for reading. This means any time a hash is calculated for a file the “LAST ACCESS” time stamp is altered. If you don’t want last access time altered, use the -R* option to reset the access time. See also -t option. The preferred method of operation to capture the proper date and time, and perform the hash is a two line batch file.
(C:>hash -p c:\ -t3 -o output1)
(C:>hash -p c:\ -o output2)
The reader is encouraged to determine the functionality of these two commands.

VERY IMPORTANT NOTE:

Since the program allows the OS to reset the Last Access Time, if the user wishes to have the original access date of the file restored, then the environment variable RESET must be set, or the -R option must be used. Test the operation of the version of HASH you are using, and verify the output with MDIR.

In the 16 bit version, when run from a DOS reboot of a WIN9X system, the 16 bit version doesn’t alter the last access date of files. However, you only get the 8.3 DOS filename in the output. A tradeoff.

See ITS ABOUT TIME



OUTPUT

Here is a sample of the default output to a file. Everything between the two lines of ******* (stars) is what would be contained in the output file. The output record is normally 160 characters wide (including the CR/LF) and has been shortened for clarity. It begins with the C:\TMP\.... and ends  with the Eastern Standard Time (EST/EDT:-5)

Depending on options used, the output record length is modified. However, it is always fixed in length based on the options chosen.

*****************************************************************
Started Sat Dec 28 19:20:25 2002 GMT, 14:20 Eastern Standard Time (EST/EDT:-5)
C:\UTILS\NTUTILS\HASH.EXE sedline.txt -o junk 

 -------- BEGIN PROCESSING MD5 -----------
C:\TMP\sedline.txt   139AE24DA60488F77A251CB29A012628   34 07/03/2002 16:09w EST 
 -------- END PROCESSING MD5 -----------

 Processed 17 directories, 1 files, 34 bytes: 
 Elapsed:  0 hrs. 0 mins. 1 secs.
**************************************************************

The items in the output file are:

1: Date and time the program was run
2: The command line that was run
3: The line  -------- BEGIN PROCESSING MD5 -----------
    indicates the beginning of the fixed length output records
4: The output records (fixed length) made up of:
     a: file being processed (full path)
     b: MD5 hash total (40 characters + 2 blanks) (or 40 blanks)
     c: File size
     d: File date
     e: File time (including NT time type (acw) if necessary)
     f: Time zone setting. (if one is in use or set)
5: The line  -------- END PROCESSING MD5 -----------
     indicates the end of the fixed length outputs
6: A line indicating how many files were processed.

The lines ----- BEGIN and ----- END ... are inserted so the users can easily identify the files processed. The ending parts (lines 5 and 6) are removed each time the file is appended to.

If comparisons against other runs need to be done, the files can be compared in a data base environment. The program HASHCMP has been specially designed to compare output files created by the HASH program.
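For readers who want to pull the fields out of a record themselves, here is a rough Python sketch. It assumes a default MD5 run in which the hash, size, date, time and time zone are all present and separated by blanks, as in the sample record; it splits from the right so that a path containing spaces stays in one piece. The exact column positions are not documented here, so this does not rely on them. HashRecord and parse_record are made-up names.

```python
from typing import NamedTuple

class HashRecord(NamedTuple):
    path: str
    digest: str
    size: int
    date: str
    time: str
    time_type: str  # 'a', 'c' or 'w' when present, else ''
    zone: str

def parse_record(line):
    """Split one output record into fields, working from the right."""
    path, digest, size, date, time_field, zone = line.strip().rsplit(None, 5)
    time_type = time_field[-1] if time_field[-1].isalpha() else ""
    clock = time_field.rstrip("acw")
    return HashRecord(path, digest, int(size), date, clock, time_type, zone)

sample = r"C:\TMP\sedline.txt   139AE24DA60488F77A251CB29A012628   34 07/03/2002 16:09w EST"
print(parse_record(sample))
```

Records produced with other options (no time zone, blank hash field, -A, -C and so on) have a different layout and would need a different split.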

A suggestion on how to use this program

Create a reference output file of all the programs on the disk. At a later date, create a second output file, and compare the 1st and 2nd using the HASHCMP program. If changes occurred, take action.
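The reference-then-compare workflow boils down to a comparison like the following Python sketch. It is not how HASHCMP is implemented; it simply assumes each run has already been reduced to a dictionary mapping file path to hash value, and compare_runs is a hypothetical name.

```python
def compare_runs(reference, later):
    """Report files whose hash changed, plus files added or removed."""
    common = reference.keys() & later.keys()
    return {
        "changed": sorted(p for p in common if reference[p] != later[p]),
        "added": sorted(later.keys() - reference.keys()),
        "removed": sorted(reference.keys() - later.keys()),
    }

first_run = {r"C:\a.exe": "AAAA", r"C:\b.dll": "BBBB", r"C:\old.bat": "CCCC"}
second_run = {r"C:\a.exe": "AAAA", r"C:\b.dll": "FFFF", r"C:\new.exe": "DDDD"}
print(compare_runs(first_run, second_run))
```

Any entry under "changed" is a file whose contents differ between the two runs and deserves a closer look.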



ITS ABOUT TIME


In Windows operating systems, file times are maintained using three different values. There is the “Creation Time” (when the file was originally created or written to that disk media), the “Last Write Time” (last time the file was written/modified), and the “LAST ACCESS DATE/TIME” (last time the file was accessed).

For FAT32 file systems, for the last access date and time field, only the date is maintained. The last access time on FAT32 file systems is always 00:00. Assume all references to WIN9x and NTFS take this into consideration.

Almost every application that opens a file for reading changes the “LAST ACCESS” time of the file. This means if you use a program that merely “views” the contents of the file, you may very well be altering the “LAST ACCESS TIME” of the file. If this is a major concern, and in some investigations the last access time could be very important, determine beforehand whether the particular application alters the access time. (You may use the 32 bit version of MDIR to verify file time alterations.) At the very least, you will be altering that part of the disk where the last access time is stored. (The Windows TYPE, MORE, and PRINT commands, OutsideIn, Quick View Plus and many others all alter access times.) Unless you have tested and confirmed otherwise, assume all programs alter last access time.

If you use CRCKIT, HASH, or (DISKCAT with the -h or crc option) the last access time is changed by the operating system every time the program is run. (the HASH -t3 option does not open files, and thus is the only hash option that doesn’t change the access times).

If you want to have the program attempt to RESET the last access time back to its original value, you can do it in one of two ways. The first way is to use the -R option. The -R option tells the program to attempt to reset the last access time to the original value before the program ran. This will be accomplished successfully on all files except those “LOCKED” by the operating system. Those files are traditionally the system files. They can never have their last access time reset.

The second way is to set an environment variable called RESET. (set RESET=1) If the program detects the RESET variable, it will always attempt to reset the access time to its original value. This is identical to the -R option.

Setting/resetting the last access time could have evidentiary consequences, and the user should be certain that a sound explanation is available.

After the file has been opened and the calculation has been made, if the -R (RESET) option is set, the file times will be maintained and not altered. However, there are some concerns:

1. Even though the last access time is reset to the original before the program examined the file, the program is technically changing the disk. The disk is first changed by the operating system to set a current last access time and then the -R causes the program to reset the file time to the original. The ultimate effect is no change in substance (the value of “LAST ACCESS TIME” is as it was before the program was run). However, the disk has actually been changed twice. Once by the system, and once by the program.

2. If the file being looked at is a system type file (in use by the operating system) or if the file has a readonly attribute set, then the program cannot replace the original file access time, and the new one, set by the operating system is used. This definitely produces a change in the last access time. Again the program has no control over this. It is the operating system which sets the time. The program does however produce a message on the screen that it cannot reset the file time. So the user will be able to determine which files have had times changed.
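The save, read, then restore sequence described above can be sketched in Python. This only illustrates the general technique using the standard os.stat and os.utime calls; it is not the code behind the -R option, and hash_and_reset_times is a made-up name. As the two concerns above point out, the restore is itself a second write to the disk, and it can fail on locked or read-only files.

```python
import hashlib
import os

def hash_and_reset_times(path, algorithm="md5"):
    """Hash a file, then try to put back its original access/write times."""
    before = os.stat(path)  # capture the times before the file is opened
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    try:
        # Second change to the disk: restore the saved timestamps.
        os.utime(path, ns=(before.st_atime_ns, before.st_mtime_ns))
        reset_ok = True
    except OSError:
        reset_ok = False  # e.g. a locked or read-only file
    return h.hexdigest().upper(), reset_ok
```

A caller would report every file for which reset_ok comes back False, so the user knows which access times were left changed.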

Some examples of how NTFS treats different operations.

A (+) plus sign means this time is altered, and is usually the current time; a (-) minus sign means the time is left as is; the (*) means the write time of the source file is maintained on this new file.

                    Effect on:
Operation:       Access  Create  Write

COPY (source)      +       -      -
COPY (dest.)       +       +      *  (write time of the source is used)
PRINT              +       -      -
MSWORD (save)      +       +      +
MSWORD (print)     +       -      -  (close without alteration)
Quick View Plus    +       -      -
DIR (FILE MANAGER) -       -      -

The last access date for FAT32 file systems only maintains the date of access and not the time.

The last access time of NTFS file systems is updated only in hour increments. This means you could access a file three times within one hour, but only one time update would occur. (Microsoft could change this at any time, so do your proper due diligence when this is an important factor.)

When working with the 32 bit operating systems you should familiarize yourself thoroughly with the consequences and side effects of altering file times when using any programs that open/view or copy files.

Also you should take note of the CMOS time settings on the suspect computer with regard to time zone settings, Daylight Savings time settings, and the local time the computer is maintaining. Some of these settings can be altered/set within the autoexec.bat of the suspect computer. Any or all of these settings affect the way the file times are displayed on your forensic machine if the settings are not identical.

This is not an absolute, just a caution. For this reason, HASH and CRCKIT have options (-Z[ulu]) to "normalize" the times from local to UTC/GMT. If you are dealing with many computers from time zones different than your own, you might want to deal with GMT. This should eliminate any differences in machine settings. All of this is with the caveat that the suspect's machine originally had a time set that was reasonably accurate for his/her time zone. I suggest the investigator check out time anomalies on files created on differing systems.
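Normalizing a local time to UTC/GMT is simple arithmetic once the offset is known. The Python sketch below shows the idea with the standard datetime module; it is not the -Z implementation, the date format string is an assumption, and to_utc is a hypothetical name. The example values come from the sample output header earlier on this page (14:20 Eastern Standard Time, offset -5, shown as 19:20 GMT).

```python
from datetime import datetime, timedelta, timezone

def to_utc(local_text, utc_offset_hours, fmt="%m/%d/%Y %H:%M"):
    """Convert a local date/time string with a known offset to UTC."""
    local_zone = timezone(timedelta(hours=utc_offset_hours))
    local_time = datetime.strptime(local_text, fmt).replace(tzinfo=local_zone)
    return local_time.astimezone(timezone.utc)

# 14:20 local at UTC-5 is 19:20 GMT.
print(to_utc("12/28/2002 14:20", -5).strftime("%m/%d/%Y %H:%M GMT"))
```

The result is only as good as the offset supplied, which is why the clock and time zone settings of the suspect machine need to be checked first.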

DON'T FORGET:

Any read/open/view etc. of the file by almost any program WILL BE ALTERING THE HARD DISK, AND EVIDENCE.


OPTIONS

Usage: hash    -[options]

At least 1 initial file or path is recommended. For additional paths or filetypes use the -p and/or -f options. If only a file name is used, the current default path is used, and recursed from there.

This program is INI capable. INI keywords in [BOLD]

All options should be preceded by a (-) minus sign. Some can be grouped together, and others where specified MUST be grouped without a space. The options are grouped where appropriate.

DO NOT include the + sign or the colon (:) in your command line. The + sign is used to indicate that this option takes a modifier or additional information.

Some options because they deal with specific 32 bit items like MDS or file times are only active in the 32 bit version running on an appropriate file system.

-p + path(s):  If more than one directory is needed to be looked at, then add the paths here as appropriate. (-p c:\windows    d:\work)   [PATH=path]

-f + filespec:  If more than one file type is needed, add them here. (-f   *.c   *.obj   *.dll)   [FILES=filetype]

If these options are used, the program builds a matrix of paths and file types. It searches all the requested directories for all the requested file types. Thus giving a total of all the files in all the paths requested. These options are added to any default command line provided. (C:>hash c:\work\*.c -f *.dll -p d:\windows)

-x + filespec:  e(x)clude these file types from listing. Maximum of 100 file types accepted. (same format as -f option) (-x thesefiles.txt) [EXCLUDE=filetype]

-oO + filename:  Output file name. Place the output to a filename. If uppercase ‘O’ then existing output is appended to. [OUTPUT=filename]

-a: append output to filename provided in -o option. Serves same purpose as using an upper case O. [APPEND=[ON|OFF]]

-1 + filename:  (that's a one, not ell) The filename here is a file which will contain accounting/log information about the run. It is always appended to, and contains the command line, and statistics about how many files and time of run. The file can later be used as a batch file for duplicating the runs. The ACCT environment variable can also be set. (SET ACCT=logfilename). Or use the .INI option [ACCT=filename] The order of priority is: Environment, INI file, Command Line option. To explicitly turn off use a +1.

-C + "comment"  Add a "comment" to the beginning of every record. This is very useful when ultimately merging many outputs from different locations or for different cases. The comment can uniquely identify the sources of the hash values. Example, (-C SUSPECT_CPU#1). The resulting output records would look something like this: "SUSPECT_CPU#1 C:\WINNT\....\filename etc."

-C + COMPUTERNAMExx  A special version of the -C option. If the literal COMPUTERNAME (all uppercase) is used, then the program will find the name of the computer and insert it there. This is kind of like a wildcard substitution. The user can let the system decide what to put there. This can then uniquely identify the source computer of the hash values. Example, (-C COMPUTERNAME). The resulting output records would look something like this: "CPU-2_ATLANTA C:\WINNT\....\filename etc.". If the xx is replaced by a numeric value, then the computer name field is made this many characters wide. (-C COMPUTERNAME20) becomes: "CPU-2_ATLANTA        C:\WINNT\....\filename etc."
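The COMPUTERNAME substitution and padding can be pictured with the Python sketch below. It is a guess at the visible behavior only: it reads the Windows COMPUTERNAME environment variable (falling back to the host name), pads the name to the requested width, and leaves any other comment untouched. How HASH itself handles a name longer than the width is not stated above, so this sketch simply leaves such a name unshortened. comment_prefix is a made-up name.

```python
import os
import re
import socket

def comment_prefix(comment):
    """Expand COMPUTERNAME or COMPUTERNAMExx; pass other comments through."""
    match = re.fullmatch(r"COMPUTERNAME(\d*)", comment)
    if match is None:
        return comment  # an ordinary comment such as SUSPECT_CPU#1
    name = os.environ.get("COMPUTERNAME") or socket.gethostname()
    width = int(match.group(1)) if match.group(1) else 0
    return name.ljust(width)  # pad with blanks up to the requested width

print(repr(comment_prefix("COMPUTERNAME20") + " " + r"C:\WINNT\filename"))
```

On a machine named CPU-2_ATLANTA, a width of 20 pads the 13 character name with 7 blanks before the separator and the path.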

-S:  If the file system is NTFS, this option causes all Alternate Data Stream files to be processed also. [STREAM=[ON|OFF]]

Hash calculation options: (-s -A -B -c -256 -384 -512) Default option is MD5 128 bit.

-s:  produce the 160 bit SHA output instead of the 128 bit MD5 hash.

-B:  produce Both the MD5 and SHA of a file. (This option available only for 32 bit version.)

-256:  produce the 256 bit SHA2 calculation. (not compatible with default MD5 128 bit)

-384:  produce the 384 bit SHA2 calculation. (not compatible with default MD5 128 bit)

-512:  produce the 512 bit SHA2 calculation. (not compatible with default MD5 128 bit)

-c:  produce a 32 bit CRC output instead of the 128 bit MD5 hash.

-A:  This is a very special option. It causes the hash to be computed, and also includes all three (3) file date/times in the output. The original access date is captured and maintained in the output record even though after the hash calculation is performed, the current access date is modified. This output record is very large (over 180 characters wide). This option also includes in the output record the file attributes. In effect, it gives you almost everything you would want to know about the file (except the file type based on header). (THIS OPTION IS ONLY AVAILABLE IN THE 32 BIT VERSION)

Note: The use of -256, -384, -512, will provide each of the calculations. If you wish to get both the MD5 and SHA1 the -B option is implemented for this. If you want to add the three file times, the -A (for ALL times) is implemented for this. -AB option will provide 128 bit, 160 bit and 3 file times.

-g + #:  Where the # is replaced by a number indicating, list all files ‘g’reater than # days old. You can use a -gl pair to bracket file ages. [OLDER=xxx]

-l + #:  (ell, not one) Where the # is replaced by a number indicating, list all files ‘l’ess than # days old. You can use a -gl pair to bracket file ages. To get today's files, use (-l 1) [NEWER=xxx]

-g + mm-dd-yyyy
-l + mm-dd-yyyy:  (that's an ell, not a one). Process only those files (g)reater than (older than) or (l)ess than (newer than) this mm-dd-yyyy date. The date MUST be in the form mm-dd-yyyy. It MUST have two digit month and days (leading 0 if necessary), and it MUST have a 4 digit year. The date given mm-dd-yyyy is NOT included in the calculation. Ie. if today was 01-10-2003 and you entered -l 01-09-2003 you would only process today's files. If you wanted to include those on 01-09, you should have entered -l 01-08-2003.

-g + #    Where the # is replaced by a number indicating: list all files ‘g’reater than # days old. You can use a -gl pair to bracket file ages. [OLDER]=50

-l + #    (ell, not one) Where the # is replaced by a number indicating: list all files ‘l’ess than # days old. You can use a -gl pair to bracket file ages. To get today's files, use (-l 1) [NEWER]=10

-g + mm-dd-yyyy[acw]
Process only those files (g)reater (older) than this mm-dd-yyyy date. The date MUST be in the form mm-dd-yyyy. It MUST have two digit month and days (leading 0 if necessary), and it MUST have a 4 digit year. The date comparison is made as of midnight on the date given for the -g option. For this reason, the day provided is NOT included in the calculation. I.e. if you entered -g 01-01-2006 you would only process dates PRIOR to 1/1/2006. This means all of 2005 and before. See below for the [acw] meanings.

-l + mm-dd-yyyy[acw]:  (that's an ell, not a one). Process only those files (l)ess than (newer than) this mm-dd-yyyy date. The date MUST be in the form mm-dd-yyyy. It MUST have two digit month and days (leading 0 if necessary), and it MUST have a 4 digit year. The date comparison is made as of midnight on the date given for the -l option. For this reason, the day provided IS included in the calculation. I.e. if you entered -l 01-01-2006 you would process all of 2006 to the current date.

Special note for the [acw] modifier part of the option.

If no 'acw' modifier is used, the default time used to check the age is the current write or last modification time.

You can, however, alter which time is used in the age calculation. To do this, add any or all of the acw indicators. For instance, if you wanted the date checking to respond to the access date, you would add an 'a'.    i.e.: -l 10-10-2005a would show all files accessed on or after 10-10-2005.

If you added more letters to the date, i.e.:   -g 10-10-2005cw    you would get all files with EITHER a creation or a last modified date older than 10-10-2005. The added [acw] times are logically OR'd. So any date meeting the criteria will cause the file to be selected for processing.

The use of all three, -g 10-10-2005acw, allows the program to simultaneously check and evaluate all three dates.

Caution should be exercised in using all three dates, as in most cases, almost every file may fit the criteria.
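
The midnight cutoff and the OR logic of the [acw] modifiers can be sketched in a few lines of Python. This is only an illustration of the rules as described for the -g and -l mm-dd-yyyy[acw] options, not the program's actual code. The function names parse_cutoff and matches, and the dictionary of times keyed by 'a', 'c' and 'w', are assumptions made for this example.

```python
import re
from datetime import datetime

def parse_cutoff(text):
    """Parse a strict mm-dd-yyyy string into midnight at the start of that day."""
    if not re.fullmatch(r"\d{2}-\d{2}-\d{4}", text):
        raise ValueError("date must be mm-dd-yyyy with leading zeros")
    return datetime.strptime(text, "%m-%d-%Y")

def matches(times, cutoff_text, older, modifiers="w"):
    """Decide whether a file is selected.

    times maps 'a', 'c', 'w' to datetime values (hypothetical layout).
    older=True models -g: strictly before midnight, given day excluded.
    older=False models -l: at or after midnight, given day included.
    When several modifiers are given, the checks are OR'd.
    """
    cutoff = parse_cutoff(cutoff_text)
    if older:
        return any(times[m] < cutoff for m in modifiers)
    return any(times[m] >= cutoff for m in modifiers)

if __name__ == "__main__":
    sample = {
        "a": datetime(2006, 3, 1, 9, 0),
        "c": datetime(2005, 6, 1, 8, 0),
        "w": datetime(2006, 1, 1, 0, 0),
    }
    # Default 'w' time is exactly midnight on the cutoff day, so -g skips it.
    print(matches(sample, "01-01-2006", older=True))
    # With 'cw' the older creation date is enough to select the file.
    print(matches(sample, "01-01-2006", older=True, modifiers="cw"))
```

Because the checks are OR'd, every letter added can only widen the selection. This is the reason for the caution above about using all three.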

-L + #:  Where the # is replaced by a number indicating, list all files less than # bytes in size. (-L 100000) [LESSTHAN=xxx]

-G + #:  Where the # is replaced by a number indicating, list all files greater than # bytes in size. You can use a -GL pair to bracket file sizes. (-G 10000) (-G 10000 -L 100000) [GREATER]=10000

-P:  Pause after every 20 lines. (default is not to pause after every screen.) [PAUSE=[ON|OFF]]

-d + “delimiter”:  replace “delimiter” with a delimiter (typically a pipe ‘|’) within double quotes with which to delimit fields. If the delimiter is not printable, use its decimal ASCII value but don’t place it in quotes. (-d “|”) [DELIMETER=xx]
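
As a rough illustration of the two ways to give a delimiter, the Python sketch below treats an all-digit argument as a decimal ASCII value and anything else as the literal delimiter. How HASH itself tells the two apart is not documented here, so this rule is an assumption, as are the function names and the sample field order.

```python
def resolve_delimiter(arg):
    """Treat an all-digit argument as a decimal ASCII code, else use it literally.

    Assumption for this sketch: a quoted digit such as "1" is not handled.
    """
    if arg.isdigit():
        return chr(int(arg))
    return arg

def format_record(fields, delimiter):
    """Join the fields of one output record with the chosen delimiter."""
    return delimiter.join(str(field) for field in fields)

if __name__ == "__main__":
    # Hypothetical field order: name, hash, size, date, time.
    record = ["HASH.DOC", "AC38FF51EAAF04739B0F7FCCB7001762", 4697,
              "03/31/1995", "12:12:28"]
    print(format_record(record, resolve_delimiter("|")))
    # Decimal 9 is the tab character, which cannot be typed in quotes easily.
    print(format_record(record, resolve_delimiter("9")))
```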

-w + #:  Change the default width of the filename from 38 to whatever value you wish. If you have long filenames, this may be necessary to accommodate the entire name. If a filename longer than 38 is used, the output tends to be more than one line long. Usually a -w 160 will suffice to get all but the most extreme long file names. (-w 50) [WIDTH=xx]

-M:  When doing the pre-scan of the drive to count the number of files, also calculate the (-M)aximum number of characters needed for the longest filename, and treat it as if the -w # option was used. This automatically sets the -w option to the correct value.

-[tT] + [acw30]:  Show the file time as last ‘a’ccessed, last ‘w’ritten, ‘c’reated, or show all ‘3’. No spaces between the -t and the modifier. ( -tc or -t3 ) If the -t3 option is used the program DOES NOT open the file and thus does not change the access date. In this case, all three file times are placed into the output record.

If the -T is uppercase, then the date is reversed to reflect YYYY/MM/DD. This format facilitates sorting on date and time.

Default is the ‘w’rite time, which is identical to what DIR or Explorer displays. Note: The 3 file time capability is only available under 32 bit operating systems using the 32 bit version of the program. (L) The Linux version has different -t options, because Linux display of file times might be a little different.

Some of the options (-sAB 256, 384, 512) may conflict in logic with the -t3 and -t0 options. If a -t3 is used, the default is to NOT perform any hashing. Use this to perform a simple catalog without changing file access dates. To obtain all three times, and an MD5 hash, you should use the -A option which will ALWAYS override the -t3 and insert the MD5. To add an SHA1, include the -B (both MD5 and SHA1). The inclusion of the -B elicits only a single time, even if the -t3 is used. To get three times when using the -B, you must also use the -A which adds the times. The logic here is somewhat convoluted, but the matrix is hard to design. The user should test the options.

[TIME=[A|C|W|3]], [ALLTIMES=[ON|OFF]]
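
The -T note above says the reversed YYYY/MM/DD form makes sorting easier. The short Python sketch below shows why: a plain text sort of MM/DD/YYYY dates groups by month first, while YYYY/MM/DD text sorts in true date order. The format_stamp function is made up for this example and the exact layout HASH prints may differ.

```python
from datetime import datetime

def format_stamp(stamp, reverse_date=False):
    """Format a time as MM/DD/YYYY (default) or YYYY/MM/DD (reversed)."""
    pattern = "%Y/%m/%d %H:%M:%S" if reverse_date else "%m/%d/%Y %H:%M:%S"
    return stamp.strftime(pattern)

stamps = [datetime(1995, 3, 31, 12, 12, 28), datetime(1994, 12, 1, 8, 0, 0)]

# Plain text sort, as a sort utility or database would do on a character field.
default_sorted = sorted(format_stamp(s) for s in stamps)
reversed_sorted = sorted(format_stamp(s, reverse_date=True) for s in stamps)

if __name__ == "__main__":
    print(default_sorted)   # the 1995 date sorts ahead of the 1994 date
    print(reversed_sorted)  # true date order
```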

-Z:  If using 32 bit version, display time in ‘Z’ULU UTC/GMT format. The letters GMT will be at the end of the output line indicating such. Use GMT to get relative references especially when dealing with 2 or more time zones. See note below on time zones: (-z) [ZULU=[ON|OFF]]

-m:  Show file last write (-modified) date. Same as -tw option. (-m) [MILITARY]=[ON|OFF]

-N:  Provide in the output only the path/filename and the calculation. No dates, times or file sizes are included.

-n:  Strip the path from the filename, and list only the filename itself.

-8:  Add the DOS 8.3 filename to the end of the record.

-88:  Add the uppercase Long File Name to the end of the record. This option strips the LFN from the path listing of the first field, and places only the LFN at the end of the record. The default length is a 75 character field. (Note: the -8 and -88 options are mutually exclusive. Use one or the other).

-88xx:  Replace the xx with a value. This value will now determine how wide the Long File Name field will be. The default LFN length for hash is 25 characters.

-R:  Reset file last access time

-v:  Silent run. NO VERBOSE. Do not print normal column headings above numbers. This provides cleaner screen output for redirection to a file. This can also be accomplished by setting an environment variable called silent to ON. (set SILENT=ON). The SILENT environment variable is used by crckit also. The output at this point is ready for import into a data base. [SILENT=[ON|OFF]]

--source=listfilename:  Provide a list of files to hash in the file identified by the name: listfilename. One filename per line. The filename must contain the complete path of the file to hash. The program reads the text file one line at a time and processes that file. There should be a blank line at the end to indicate no more files to process.
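
The list file handling described for --source can be pictured with the following Python sketch. It reads one full path per line, stops at the first blank line, and produces an MD5 for each file. It is only an approximation of the behavior described, not the program's code, and the function names md5_of_file and hash_from_list are invented for the example.

```python
import hashlib

def md5_of_file(path, chunk_size=65536):
    """Return the uppercase hex MD5 of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest().upper()

def hash_from_list(list_path):
    """Hash every file named in list_path, one complete path per line.

    A blank line (or end of file) marks the end of the list.
    """
    results = []
    with open(list_path, "r", encoding="utf-8") as listing:
        for line in listing:
            name = line.rstrip("\r\n")
            if not name:
                break  # blank line: no more files to process
            results.append((name, md5_of_file(name)))
    return results
```

A call such as hash_from_list("filelist.txt") would return a list of (path, hash) pairs, where filelist.txt is a hypothetical list file built as described above.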


Time Zones

If you are using the 32 bit version in a DOS box, the time zone is properly displayed at the end of the record.

C:\WORK\PUBLISH\HASH.DOC 
 AC38FF51EAAF04739B0F7FCCB7001762        4697  03/31/1995  12:12:28w EST

This is provided your OS has been properly set up to the correct time zone. This is accomplished in the control panel under the date/time icon.

However, if you are using the 16 bit version either from a DOS boot, or in a DOS box, you must set a TZ environment variable to tell the program the proper time zone. Otherwise it will always respond with a time zone of PST. To set the TZ variable use something like:

SET TZ=EST5EDT

Or whatever time zone is applicable. If you don't know what an environment variable is, or don't know how to set it, you will have to do your own research.
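
For readers unsure how a TZ value is put together, the Python sketch below splits the simple form into its parts: a standard zone name, the number of hours west of UTC, and an optional daylight zone name. This follows the common C runtime convention; the function parse_tz is made up for this example and only handles the simple three-letter form.

```python
import re

def parse_tz(value):
    """Split a simple TZ value such as PST8PDT into its parts.

    Returns (standard_name, hours_west_of_utc, daylight_name_or_None).
    """
    match = re.fullmatch(r"([A-Za-z]{3})([+-]?\d{1,2})([A-Za-z]{3})?", value)
    if match is None:
        raise ValueError(f"unsupported TZ value: {value!r}")
    standard, hours, daylight = match.groups()
    return standard, int(hours), daylight

if __name__ == "__main__":
    # Pacific time: PST is 8 hours west of UTC, PDT is the daylight name.
    print(parse_tz("PST8PDT"))
```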



COMMAND LINES

c:>hash c:\ -o a:c_drive
Do hash of files for entire C: drive.

c:>hash c:\work
Do hash of files in path C:\work

c:>hash c:\work -r -S
do C:\work path without recursion, process Alternate Data Streams

c:>hash c:\work\*.c
do C:\work path for all *.c files (add -r for no recursion)

c:>hash c:\work -n
do C:\work printing only filename

c:>hash c:\work -w 30
do C:\work printing 30 characters of filename

c:>hash *.c -c
create CRC32 instead of MD5 of all *.c files


RELATED PROGRAMS

CRCKIT

DISKCAT

DISK_CRC

HASHCMP

MD5
