Bates_no



Author: Dan Mares, info @ maresware . com
This program may also be available for Linux (intel) platforms.
Portions Copyright © (2001, 2008) Mares and Company, LLC,
Phone/fax: (770)242-6687 X 119
Last page update: 3/21/2008



Purpose

The Bates_no program allows you to implement a file re-naming process similar to the Bates numbering system to assist attorneys and those who need to create unique filenames to associate with documents relating to a particular case. The program uses the idea of the Bates numbering system to fulfill this requirement.

This program allows the user to efficiently create a Bates numbering system within the logical file system on a computer. It will modify filenames or filename extensions to contain a unique Bates number.

Bates_no can also copy a number of files from a suspect location to a work directory. (This copy operation is similar to that performed by the Upcopy program except that here we are adding the Bates number.) At the same time it performs this copy operation, it renames (and numbers) the files to correspond to a Bates numbering system. This renaming procedure also keeps duplicate files from overwriting each other since the Bates numbers cause a unique name to be generated.

Once the files are renamed, a catalog can be compiled of the new file names (or with the use of the -o option, this catalog is automated) and be printed or used in the legal discovery process.

As of March 2008, enhancements make it possible to remove the square-bracketed index number (for example, [12345]) from FTK exported files while adding unique sequential Bates numbers. If you supply the file list generated by the FTK Copy Special process, an output file containing a cross reference is also created. A similar enhancement allows sequential numbering of X-Ways exported files while matching their names to the export list created within X-Ways. Both capabilities are useful for e-discovery purposes.



Operation

The program can accomplish one of two tasks: Bates rename, or copy and Bates rename.

The user supplies, at a minimum, a starting drive or directory (folder) with which to begin, and the Bates number template for the numbers which are to be used in the operation. Other options are available to fine tune the selection criteria for the files which are to be targeted. Since the program defaults to renaming ALL files it finds, it is suggested that only those files to be renamed exist in the path provided (-p option) as a starting point.

Format of the AnAnANNNN alphanumeric Bates number template/mask.

The template is of the alphanumeric format: AnAnANNNN, where the AnAnA is the alpha root of the Bates numbers, and the NNNN is a starting number. The size (length) of the alpha or numeric section is arbitrary, and can be up to 16 characters. However, there can be NO spaces in the template. A sample template might be: DJM0000. It is suggested that the Alpha part of the number be unique enough so as not to conflict with any existing filename extensions or prior numbering passes. If files are found with extensions or names that are identical to the alpha section of the Bates number, the file may not be processed.

Suppose we have the mask ABC32D000. The alpha portion (ABC32D) of the template can contain both letters and digits, subject to the following requirements: it must begin with an alpha character and must end with an alpha character (e.g., ABC32D). If digits (32) are used within the alpha part of the mask, they must be surrounded by alpha characters. The reason is that the numeric sequencing of the number section of the Bates number is triggered by the last alpha character of the mask: once the last alpha character (D) is found, whatever follows it (000) is the start of the sequence.

The "minimum" sequence number length is determined by the width of the number part of the mask. If 000 is used, then the minimum number width is 3, so all sequence numbers will be 00X, etc. If the mask were only 0, the sequence numbers in the Bates number would increase in width as the numbers grew to 10 and 100. For sorting and other fixed-width uses, it is suggested that a fixed width of at least 3 (000) always be used.

The number found in the mask (000, etc.) is the starting number that will be used. So if you have previously used 000-100, start the next set at 101 (ABC101).
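The mask rules above can be illustrated with a short Python sketch. This is not the program's own code; the function names split_mask and bates_numbers are hypothetical, and the sketch assumes that the trailing run of digits is the number section and that the first file receives the starting number itself (as the sample output later in this page suggests).

```python
import re


def split_mask(mask):
    """Split a mask such as 'ABC32D000' into (root, start, width).

    Assumption: the trailing run of digits is the number section and
    everything in front of it is the alpha root.
    """
    if " " in mask:
        raise ValueError("the mask must not contain spaces")
    match = re.fullmatch(r"(.*\D)(\d+)", mask)
    if match is None or not match.group(1)[0].isalpha():
        raise ValueError(f"not a usable mask: {mask!r}")
    root, digits = match.groups()
    # The digit count gives the minimum width; its value is the start number.
    return root, int(digits), len(digits)


def bates_numbers(mask, count):
    """Return the first `count` Bates numbers produced by `mask`."""
    root, start, width = split_mask(mask)
    # Zero-pad to the minimum width; longer numbers simply grow wider.
    return [f"{root}{n:0{width}d}" for n in range(start, start + count)]


if __name__ == "__main__":
    print(split_mask("ABC32D000"))
    print(bates_numbers("DJM0000", 3))
    print(bates_numbers("X9", 3))
```

With a one-digit number section such as X9, the generated numbers widen as they pass 10, which is why a fixed width of at least 3 is suggested.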

The program begins searching the designated source location (drive or subdirectory/folder) for files. As it locates files it either renames or copies each file by first incrementing the numeric portion of the template and then inserting the template within the new filename.

Under this (default) RENAME operation

The template is inserted before the extension of each file. (If the file has no extension to begin with, then the Bates number becomes the extension.) If the (upper case) -P option is used, the Bates number is instead "pre-pended" (put before) the name. (See the -P option.) Because the numeric portion is incremented each time, each file receives a unique number, and the new name built around that number is unique as well. A file with the name:
D:\first_dir\second_dir\etc\filename.ext
would be renamed to:
D:\first_dir\second_dir\etc\filename.DJM000.ext

This results in totally unique filenames, identifiable by their Bates number.
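As an illustration of the naming rule only, here is a minimal Python sketch. The function name bates_name and its prepend parameter are hypothetical (prepend mimics the -P option described on this page); the sketch builds the new name as a string and does not touch any files.

```python
import ntpath  # Windows path rules, regardless of the host platform


def bates_name(path, bates_no, prepend=False):
    """Return the path a file would have after Bates renaming."""
    folder, name = ntpath.split(path)
    if prepend:
        # -P style: BATES100.FILENAME.EXT
        new_name = f"{bates_no}.{name}"
    else:
        # Default: insert before the extension. With no extension,
        # ext is empty and the Bates number becomes the extension.
        stem, ext = ntpath.splitext(name)
        new_name = f"{stem}.{bates_no}{ext}"
    return ntpath.join(folder, new_name)


if __name__ == "__main__":
    print(bates_name(r"D:\first_dir\second_dir\etc\filename.ext", "DJM000"))
    print(bates_name(r"A:\FILENAME", "BATES0000"))
    print(bates_name(r"A:\FILENAME.EXT", "BATES100", prepend=True))
```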

The only restriction to the numbering template is that if the program is run more than once on a file system, subsequent templates must have a unique (different) alpha root. This is because of internal checks which have to be made to guarantee that a file doesn't get renamed more than once per session.

When the process is finished, a catalog can be run of the entire file system to obtain the new file names. Then put this into a searchable file. Or, you can use the -o option which will create on the fly a catalog of all the new file names.

Under the COPY and RENAME operation

The user would provide a destination path on the command line using the -d (destination) option. This destination path is used as a copy destination of all the files found. Be careful. If you point the program at a root of a drive for the source location, your destination path will be a single directory containing copies of every file on the source drive. This might choke some programs and operating systems. This copy operation is similar to the Upcopy program operation in every way, except this program includes the Bates number in the new filename.

Before the copy operation takes place, the template is inserted between the filename and the extension. The same type of Bates filename is produced as the renumber operation mentioned earlier. Because the numeric portion is incremented each time, a unique number is produced for each file even if the source drive contained similar file names in different directories (which is not unusual). The destination file now has a new name in the destination directory (provided by the user) and a filename consisting of the original filename, coupled with the Bates number inserted.
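The effect of copying into a single destination directory can be sketched in Python. This is a planning sketch under stated assumptions, not the program's implementation: plan_copies and its parameters are hypothetical names, numbering is assumed to follow the order in which the source files are listed, and nothing is actually copied.

```python
import ntpath  # Windows path rules, regardless of the host platform


def plan_copies(sources, dest_dir, root="BATES", width=4, start=0):
    """Pair each source path with its flattened, numbered destination."""
    plan = []
    for n, src in enumerate(sources, start):
        # Only the file name survives; the source tree is not recreated.
        stem, ext = ntpath.splitext(ntpath.basename(src))
        bates_no = f"{root}{n:0{width}d}"
        plan.append((src, ntpath.join(dest_dir, f"{stem}.{bates_no}{ext}")))
    return plan


if __name__ == "__main__":
    same_names = [r"A:\one\report.doc", r"A:\two\report.doc"]
    for src, dst in plan_copies(same_names, r"C:\temp_bates_dir"):
        print(src, "->", dst)
```

Two files named report.doc from different source directories end up with different names in the destination, so neither overwrites the other.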

NOTE: the two operations are mutually exclusive. You can't renumber the files in place AND copy them (-d option). If you wanted to rename them in place, AND copy them, use the Bates_no program first to rename them in place, then use the Upcopy program to copy them.

Spreadsheets

Once the Bates numbering has been completed most users perform some sort of catalog listing of the newly created subdirectory (folder). The catalog can be created by using the Diskcat command, or by using the -o option (see below). This helps to confirm the new names and generates a well formatted output which can be easily imported into a spreadsheet program for further manipulation. Below is a sample file created when using the -o output option.


BATES0000|A:\TIMEREST.BATES0000.obj
BATES0001|A:\PROGRAM.BATES0001.obj
BATES0002|A:\OPTIONS.BATES0002.obj
BATES0003|A:\INIT_LIB.BATES0003.obj
BATES0004|A:\FIXNAME.BATES0004.obj
BATES0005|A:\EBC_ASC.BATES0005.obj
BATES0006|A:\BASE.BATES0006.obj
BATES0007|A:\ACCTING.BATES0007.obj

Imagine importing this into a spreadsheet and using the dots (.) as a field delimiter. You would get a spreadsheet looking like the one below (imagine that the columns are at the spaces). Then you could easily sort on the Bates number column, and/or create different spreadsheet outputs as you need to.


BATES0000|A:\TIMEREST.    BATES0000.    obj
BATES0001|A:\PROGRAM.     BATES0001.    obj
BATES0002|A:\OPTIONS.     BATES0002.    obj
BATES0003|A:\INIT_LIB.    BATES0003.    obj
BATES0004|A:\FIXNAME.     BATES0004.    obj
BATES0005|A:\EBC_ASC.     BATES0005.    obj
BATES0006|A:\BASE.        BATES0006.    obj
BATES0007|A:\ACCTING.     BATES0007.    obj
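If you would rather split the records with a script than in a spreadsheet, the following Python sketch shows one possible approach. The function name parse_record is hypothetical, and the sketch assumes the default naming (Bates number inserted before the extension, not prepended with -P). It uses the Bates number from the first field to locate the split point, so dots elsewhere in the path do not disturb the columns.

```python
def parse_record(line):
    """Split one -o output record into Bates number, name and extension."""
    bates_no, new_path = line.rstrip("\r\n").split("|", 1)
    # Locate the '.BATESnnnn' segment inside the renamed path.
    before, marker, after = new_path.rpartition("." + bates_no)
    if not marker:
        raise ValueError(f"Bates number not found in path: {line!r}")
    return {
        "bates_no": bates_no,
        "name": before,
        "ext": after.lstrip("."),  # empty if the file had no extension
    }


if __name__ == "__main__":
    sample = [
        r"BATES0001|A:\PROGRAM.BATES0001.obj",
        r"BATES0000|A:\TIMEREST.BATES0000.obj",
    ]
    for row in sorted(map(parse_record, sample), key=lambda r: r["bates_no"]):
        print(row["bates_no"], row["name"], row["ext"])
```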

However, a problem occurs when a file that is being renamed doesn't have an extension. For example: FILENAME. When this file gets renamed, its Bates name looks like: FILENAME.BATES0000. Notice there is no extension or dot (.) after the Bates number. This could cause a problem for some spreadsheets in properly "columnizing" the outputs. So, if you think you will have files without extensions, and are anticipating moving the catalog to a spreadsheet, you might want to use the upper case -B BATESXXX option when numbering. The -B option will add extensions to those files that do not have them. The extension that is added is the unique .__! sequence of characters. (a dot, underscore, underscore, exclamation). This sequence is easy to find, and identify, and it also makes parsing the filename a lot easier when using spreadsheets.

BATES0008|A:\ACCTING. BATES0008. __!


March 2008 FTK and XWAYS enhancements.

As of March 2008, there are two enhancements (basically beta at this point), which make use of the exported files from FTK and X-WAYS forensic software.

FTK users are now able to delete the [indexno] reference from filenames, and add a traditional sequenced bates_no mask to the file. This is very useful for those doing e-discovery where bates or sequence numbers need to be properly indexed or in sequence.

For X-WAYS users who export/recover files, a unique Bates number can also be added to these files in the export folders, and a new catalog capable of being imported into a spreadsheet is created. Read the option below to see its benefit.



Command Lines

C:>Bates_no [source_directory] [-[options]]

C:>Bates_no C:\tmp -b BATES_alpha_ROOT_Number

C:>Bates_no -p C:\tmp -b BATE_ROOT_Number
same as the previous one, except this one makes use of the -p option

C:>Bates_no -p c:\tmp -b BATE_ROOT_Number -f *.doc
perform operations only on the *.doc files

C:>Bates_no -p c:\tmp -b BATE_ROOT_Number -f *.doc -w 120
perform operations only on the *.doc files and make the name section of the output 120 characters wide.

C:>Bates_no a: -d c:\temp_bates_dir -b BATES0000
copy the files from the A: drive to the c: directory identified by -d option.

C:>Bates_no -p source_parent_folder --XWaysfile=xways_copy_special_list -P -b bates_000
use the X-Ways catalog file xways_copy_special_list as a reference, find all files in the source_parent_folder directory which match the filenames in the list, and modify the filename using the bates mask.



Options

-p + source_dir    Use this directory as the source (starting point).

-[bB] + bates_number_template   This is the template for the Bates number. It is alphanumeric with no spaces. AAAANNNNN. The number part is used as a starting point for the sequencing. It is suggested that the number contain at least 4 or 5 digits with leading zeros as place holders so the final format is a nicely formatted fixed length Bates number. The actual length of this template is not restricted, but one of less than 10 characters is recommended.

The NNNNN portion of the mask is also used to determine the starting number. If the NNNNN portion is anything other than 00000 then the NNNNN value is taken to mean use this numeric value to start the numbering at. (ie. if the template was: DJM_012, then the file renaming would start at filename.DJM_012.ext instead of the default filename.DJM_000.ext)

Occasionally, files without extensions are found (e.g., FILENAME). When these files are renamed, the new name is FILENAME.BATESXXX. There is no ending .EXT, and this may cause some problems when importing into a spreadsheet. If you want an extension added to assist in spreadsheet importation, use the uppercase B, -B. This will add the extension .__! which can later be easily identified and helps spreadsheet formatting. See Spreadsheets above.

-[uU]    Undo or Remove the bates number from files. The -b option must also be included so the program knows which template to check filenames against. All other options remain in effect. If the -o option is used, a list is provided of all old and newly renamed file names.
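How the program matches names during an undo is not documented here, but the idea can be sketched in Python. The function name strip_bates is hypothetical; the sketch assumes you pass the alpha root of the template used for the original numbering, works on the file name only (not the full path), and does not try to remove a .__! extension added by -B.

```python
import re


def strip_bates(name, root):
    """Remove a Bates number built on `root` from a file name."""
    number = re.escape(root) + r"\d+"
    # Prepended form (-P): BATES100.FILENAME.EXT
    stripped, hits = re.subn(r"^" + number + r"\.", "", name)
    if hits:
        return stripped
    # Default form: FILENAME.BATES100.EXT, or FILENAME.BATES100 when
    # the original file had no extension.
    return re.sub(r"\." + number + r"(?=\.|$)", "", name, count=1)


if __name__ == "__main__":
    print(strip_bates("filename.DJM000.ext", "DJM"))
    print(strip_bates("FILENAME.BATES0000", "BATES"))
    print(strip_bates("BATES100.FILENAME.EXT", "BATES"))
```

A name that does not contain the template is returned unchanged.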

-P    'P'repend the Bates number to the filename. The default is to place the Bates number prior to the extension. (FILENAME.BATES100.EXT). This option allows you to prepend the Bates number to the filename. (BATES100.FILENAME.EXT). This option may be useful if you are later sorting the filenames. Since the Bates number will be at the front, they will sort more easily.

-d + path   : Path is a destination path (directory) to which EVERY file found will be renumbered and copied. Use extreme caution. This could create an extremely large SINGLE directory. No trees/paths are created under the destination path.

-f + filetype(s)    Rename only those files meeting this file type. Additional file types (max of 10) can be added by separating each one by a space. (e.g., -f *.c *.doc *.tmp *.ppt )

-x + filetype(s)   eXclude those files meeting this file type. Additional file types (max of 10) can be added by separating each one by a space. (e.g., -x myfile*.c )

-[Oo] + filename    An output file to contain a listing of all the files which are renamed. The record format for the output file is: (BATES_NO0001|C:\PATH\FILENAME.BATES_NO0001.EXT). In most cases the output record is fixed in size (see note below), but the pipe (|) delimiter is included for compatibility. This effectively creates a catalog of all the newly named files.

If the upper case O (-O) is used, the source filename, including the path is added to the output record after another pipe (|) delimiter.
(BATES_NO0001|A:\PATH\FILENAME.BATES_NO0001.EXT|C:\SOURCE\FILENAME.EXT)

Note: the -w option can affect how many characters of the renamed file are printed here. If you want the entire full name printed, either don't use a -w option, or make it large enough to cover all the possibilities in the tree you are pointing to.

-v    No Verbose output. This eliminates the headers and footers in any output file generated. The output records are fixed length at this point, and the data file can be used as input to other programs.

-w + #   Replace # with a number indicating the width you wish the output record to be. This is the width of the path/filename in the -o output file. It is suggested that you also use the -v option to eliminate headers. Use this option to get a fixed length output that is compatible with most other Maresware programs.

-r    DO NOT recurse through the source directory for files. The default is that the source directory is recursed and ALL subsequent files and directories are processed.

-i    Proceed Immediately. Without this option the source tree is first scanned and files are counted so the user knows how many files are involved.

-g + #
-l + #   
Rename only those files (g)reater than or (l)ess than # days old. Replace the # with a valid number of days. And don't include the +.

-g + mm-dd-yyyy
-l + mm-dd-yyyy
(That's an ell, not a one.) Rename only those files (g)reater than (older than) or (l)ess than (newer than) this mm-dd-yyyy date. The date MUST be in the form mm-dd-yyyy. It MUST have a two-digit month and day (leading 0 if necessary), and it MUST have a 4-digit year. The date given (mm-dd-yyyy) is NOT included in the calculation. I.e., if today were 01-10-2003 and you entered -l 01-09-2003, you would only process today's files. If you wanted to include those from 01-09, you should have entered -l 01-08-2003.
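The date rule can be expressed as a small Python sketch. The names parse_cutoff and selected are hypothetical, and the sketch assumes the comparison is made on whole days; it only illustrates that the date you enter is excluded on both sides.

```python
import re
from datetime import date, datetime


def parse_cutoff(text):
    """Parse a strict mm-dd-yyyy date (two-digit month/day, 4-digit year)."""
    if not re.fullmatch(r"\d{2}-\d{2}-\d{4}", text):
        raise ValueError("the date must be in the form mm-dd-yyyy")
    return datetime.strptime(text, "%m-%d-%Y").date()


def selected(file_day, cutoff, option):
    """Return True if a file dated `file_day` passes the -g or -l test."""
    if option == "g":
        return file_day < cutoff  # older than the date, date excluded
    if option == "l":
        return file_day > cutoff  # newer than the date, date excluded
    raise ValueError("option must be 'g' or 'l'")


if __name__ == "__main__":
    cutoff = parse_cutoff("01-09-2003")
    print(selected(date(2003, 1, 10), cutoff, "l"))
    print(selected(date(2003, 1, 9), cutoff, "l"))
```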

-1 + logfilename   file to contain accounting information.

-h + filename  create an HTML file. (If no filename is given, then INDEX.HTM is used as a default.) The file will have links to all the files which were renamed with appropriate Bates numbers. Use caution if the named file is located in the path which is being processed by the program. If the filename meets the command line requirements, it too will be renamed and included in the output. It is suggested that this file be in a location other than the processing path.

-H + filename  create an HTML file. (If no filename is given, then INDEX.HTM is used as a default.) The file will have links to all the files identified using the appropriate command line options. However, this option DOES NOT rename the files with the Bates number mask. It merely creates the HTML reference "index" file. For both the -h and -H options, the output HTML file has page breaks inserted every 45 lines so that it is easy to print the list using a browser.

-t[acw]    Specify which time type to use in the calculations: a = access, c = create, w = last write/modify time. Don't forget, in WIN9X, there is no access time.

-G + #
-L + #   
Rename only those files (G)reater than or (L)ess than # bytes in size. Replace the # with a valid file size.

-R  Reset file times. Because files are opened and read during a copy, their access date is modified on WINNT and WIN9X. This option attempts to reset the source file date back to its original value.


The FTK and XWAYS options in detail

Both options are implemented as Linux-style -- (minus minus) options.

--NOFTK:  Remove the FTK generated [square bracket] ID # from the name while renaming. Because it adds the bates_no mask to the filename, this still creates a unique final name without the [].
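The bracket removal can be pictured with a short Python sketch. The function name strip_ftk_index is hypothetical, and the sketch assumes that the FTK index is the last all-digit group in square brackets in the file name; the exact layout of FTK export names is not specified on this page.

```python
import re


def strip_ftk_index(name):
    """Remove the last [digits] group from a file name, if there is one."""
    matches = list(re.finditer(r"\[\d+\]", name))
    if not matches:
        return name
    last = matches[-1]
    # Cut out only the final bracketed group; earlier ones are kept
    # because they may be part of the original file name.
    return name[:last.start()] + name[last.end():]


if __name__ == "__main__":
    print(strip_ftk_index("letter[12345].doc"))
    print(strip_ftk_index("copy[1][678].txt"))
```

Once the index is gone, two exported files may share a name, which is why the Bates number is needed to keep the final names unique.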

--FTKFILE=filename:  Filename is the name of the tsv/csv file that was generated by the FTK "copy special" process. The resulting operation is that:

   1: The FTK[index] number is stripped from the renamed file.
   2: The filename is renamed using the bates number indexing. AND
   3: If the -O output file option is used, then the FTK data record
      relating to that index number is appended to each record of the output.
      All tabs are replaced by pipes (|) for easy import into spreadsheets.
The resulting files in those large FTK export directories now have unique sequenced bates numbers as part of their filenames.
An output file called filename.new is created containing references to the old names and the new names. It can be imported easily into spreadsheets.
Ideal for e-discovery purposes.

--XWaysfile=catalog_filename:   An enhancement for processing "exported" X-Ways files. (the XW is case sensitive and has to be uppercase for this option to work.)

The user has to already have produced the following:

1. A catalog of filenames referenced above by the word catalog_filename.
   This file contains the fields selected by the X-Ways option "Export List"
   This file is a tab delimited file. Don't worry about the number of fields which
   X-Ways outputs. The only 2 required fields are the Path and Name fields.

2. A folder with the files identified in the file referenced in 1 above.
   Created using the "RECOVER/COPY" command, dropped in any top level output folder.

   You now have a filename containing the "catalog" of all the files exported AND
   the same files copied/exported to a destination folder.
Next, note the directory where the "exported" files reside. Supply this folder name to the -p (path) option, since this top level folder is where the program starts looking for files to rename (which hopefully were exported correctly).

The rest of the command line, -b bates_no, -P (prepend) or any other options, should work as always. However, be aware that this process is mutually exclusive with some other options such as the copy or move options. This option ONLY performs the basic renaming of the files in PLACE.

If no file is found which matches the PATH/NAME combination in a catalog record, nothing is done. The exported files in the -p folder must have related records in the catalog listing.

A new logfile is created in the same location as the file named by the --XWaysfile option. Its name is formatted as: filename.new
Notice the added .new extension. This file contains 2 more fields prepended to each record.
The first field is the Bates number assigned to the file.
The second field is the NEWLY renamed filename.
Since this new output is tab delimited, it should import into a spreadsheet easily.

The original catalog might have:
Name  Description   Ext. Type Status Type descr.  Path  Size  Created
While the new output has
Batesno   NewFilename  OriginalName  Description Ext. Type Status Type descr  Path Size Created
All tab delimited.
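The shape of the new catalog can be sketched in Python. This is an illustration under stated assumptions, not the program's code: annotate_catalog and the renamed mapping are hypothetical, the catalog is assumed to start with a header row containing Path and Name columns, and catalog records with no matching file are simply left out of the result.

```python
import csv
import io


def annotate_catalog(catalog_text, renamed):
    """Prepend Bates number and new filename to matching catalog rows.

    `renamed` maps (path, name) -> (bates_no, new_name).
    """
    reader = csv.reader(io.StringIO(catalog_text), delimiter="\t")
    header = next(reader)
    # Only the Path and Name columns are needed to find a match.
    path_col, name_col = header.index("Path"), header.index("Name")
    rows = [["Batesno", "NewFilename"] + header]
    for row in reader:
        key = (row[path_col], row[name_col])
        if key in renamed:
            rows.append(list(renamed[key]) + row)
    return rows


if __name__ == "__main__":
    catalog = "Name\tPath\tSize\nmemo.doc\t\\docs\t1024\nnotes.txt\t\\docs\t12\n"
    renamed = {("\\docs", "memo.doc"): ("BATES000", "BATES000.memo.doc")}
    for row in annotate_catalog(catalog, renamed):
        print("\t".join(row))
```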

Related Programs

Diskcat

Upcopy (to simply copy files while maintaining tree structure).
