PARSER.EXE is a utility for processing fixed-length and field-delimited ASCII flat-file representations of databases. Record lengths and field positions are defined in a Profile File, as are the operations to perform on the fields. Fields can be rearranged, truncated, padded, or reordered, and characters can be deleted, replaced, or inserted. The Profile File also specifies text delimiters, field delimiters, and record delimiters. Through careful crafting of the Profile File, PARSER.EXE can be used to create comma-delimited files or other fixed-length ASCII flat files. PARSER.EXE can also create one file per record, using one of the fields from the original record as the filename, or using a default filename. Output files can be plain text files or HTML files.
PARSER.EXE INFILE.TXT [-oOUTFILE.TXT] [-pPROFILE.INI] -?
Where:
INFILE.TXT = Name of fixed length ASCII flat file.
-oOUTFILE.TXT = Results are written to the file OUTFILE.TXT; uses the standard output if not specified.
-pPROFILE.INI = PROFILE.INI is the name of the Profile File.
-? = Provides the syntax for using Parser.
The Profile File can be created using any text editor. Its general format is as follows:
! Any line starting with ! is a comment line
!
! Everything before the line starting with [parser] is ignored.
! All command arguments are space delimited.
!
[parser]
!
! The first section defines parameters needed to write the
! OUTFILE.TXT
!
record_per_file 0
! filename_field Field1 ! Not used if record_per_file is set to 0
fixed_field ! Optional indicates source file has fixed field lengths
! delimited_field $, $" ! Indicates comma is field delimiter, " is text delimiter
! Not used if fixed_field is used.
print_column_names 0
field_delimiter $\
record_delimiter $\n
start_of_text_char $"
end_of_text_char $"
record_length 62
output_text ! Output file is pure text (default)
! output_html ! Output file is in HTML
!
! The following lines define the fields in INFILE.TXT
! Note: Field names may not have embedded spaces
!
field Field1 1 10 TEXT 1
field Field2 11 10 NUMBER 2
field Field3 21 10 TEXT 3
field Field4 31 10 TEXT 4
field Field5 41 10 TEXT 5
field Field6 51 10 TEXT 6
field Field7 61 1 TEXT 0
!
field_text Field8 Text\ inserted\ in\ Field8\ with\ print_order\ 8 8
field_counter Field9 9
field_filename_field Field10 10
field_lookup Field11 Field1 lookup.txt 2 11
field_date Field12 12
field_time Field13 13
!
! The following lines define the operations to perform on the
! fields defined above. The operations are performed on the
! fields in the order they are entered in the PROFILE.
!
delete_alpha Field1
delete_nonalpha Field2
delete_numeric Field3
delete_nonnumeric Field4
delete_left Field5 1
delete_right Field6 1
keep_left Field1 5
keep_right Field2 5
insert_left Field3 <<<
insert_right Field3 >>>
replace_char Field7 $. $+
replace_text Field7 Original\ Text Replacement\ Text
strip Field6
reorder Field6 3 4 $/ 5 6 $/ 1 2
delete_char Field5 $-
pad_left Field1 10
pad_right Field2 10
to_upper_case Field3
to_lower_case Field4
!
[eof]
!
! All lines after the line starting with [eof] are ignored.
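The framing rules of the Profile File (ignore everything before [parser], skip comment lines, stop at [eof]) can be illustrated with a short Python sketch. This is not how PARSER.EXE itself is implemented; the function name read_profile_commands is hypothetical, and the sketch does not attempt to strip trailing ! comments that follow a command or to interpret \-escaped spaces inside arguments.

```python
def read_profile_commands(text):
    """Return (command, raw_arguments) pairs found between [parser] and [eof]."""
    commands = []
    in_body = False
    for line in text.splitlines():
        stripped = line.strip()
        if not in_body:
            # Everything before the line starting with [parser] is ignored.
            in_body = stripped.startswith("[parser]")
            continue
        if stripped.startswith("[eof]"):
            break  # all lines after [eof] are ignored
        if not stripped or stripped.startswith("!"):
            continue  # blank line or comment line
        name, _, rest = stripped.partition(" ")
        commands.append((name, rest.strip()))
    return commands
```

For example, feeding it a profile containing only record_length 62 and one field line between [parser] and [eof] yields those two commands and nothing else.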
3.1 record_per_file 0
If record_per_file is set to anything but 0, this command
results in a file being written for each record. The filename is either produced
based on a field specified by filename_field, or is automatically generated.
If OUTFILE.TXT is specified on the command line, it is ignored. If record_per_file is set to 0, this command prints all of the records to the OUTFILE.TXT specified on the command line.
3.2 filename_field Field1
This command is only active if record_per_file
is set to anything but 0. It specifies which field should
be used to generate the filenames for each record. The contents
of this field should be less than 8 characters and be limited
to characters allowable for filenames. The field for each record
must be unique, or records will be lost. The filename is generated
by taking the contents of the specified field and appending .txt.
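Because duplicate values in the filename field cause records to be lost, it can be worth checking the field for uniqueness before running with record_per_file enabled. The Python sketch below is only an illustration; the helper name duplicate_filenames is hypothetical, and it compares names exactly, so it does not account for filesystems that treat upper and lower case as the same name.

```python
from collections import Counter

def duplicate_filenames(field_values, extension=".txt"):
    """Return the generated filenames that would occur more than once."""
    # The filename is the field contents with .txt appended.
    names = [value + extension for value in field_values]
    counts = Counter(names)
    return sorted(name for name, n in counts.items() if n > 1)
```

An empty result means every record would get its own file.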
3.3 fixed_field
This command indicates that the input file consists
of fixed length fields and records. It should not be used if delimited_field
is also defined. If neither delimited_field nor fixed_field
is defined, then the input file is assumed to be fixed_field.
3.4 delimited_field $, $"
This command indicates that the input file consists
of delimited fields. The first character argument is the field
delimiter and the second character argument is the text delimiter.
It should not be specified in addition to fixed_field.
3.5 print_column_names 0
This command enables the printing of the field
names as the first record of OUTFILE.TXT if record_per_file is
set to 0. If record_per_file is set to
1, the field name is printed prior to the field and has
a leading @ character attached. Set print_column_names
to 1 to enable printing of field names, and set it to 0
(default) to disable printing of field names.
3.6 field_delimiter $\
This command specifies the character
or characters (max 31) to use for separating fields in OUTFILE.TXT.
The default character is the comma. In this example, the field
delimiter is being set to a space. If record_per_file
is set to 1, field_delimiter should normally be
set to the character $\n. If multiple characters are
specified, they should be separated by a space.
3.7 record_delimiter $\n
This command specifies the character
or characters (max 31) to use for separating records in OUTFILE.TXT.
The default character is the newline character. If record_per_file
is set to 1, record_delimiter specifies the character(s)
to terminate the final line of each file with. If multiple characters
are specified, they should be separated by a space.
3.8 start_of_text_char $"
This command specifies the character
or characters (max 31) to use for starting text records in OUTFILE.TXT.
The default character is the " character. If multiple
characters are specified, they should be separated by a space.
3.9 end_of_text_char $"
This command specifies the character
or characters (max 31) to use for ending text records in OUTFILE.TXT.
The default character is the " character. If multiple
characters are specified, they should be separated by a space.
3.10 record_length 62
This command specifies the length in characters
of each record in INFILE.TXT. The length should include the
record delimiter in INFILE.TXT if it exists. If a length of 0 is
specified, then each record is assumed to be terminated with a newline
character.
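The effect of record_length can be sketched in Python as follows. This is an assumption-laden illustration rather than the program's actual logic: the function name split_records is hypothetical, and the input is taken to be already loaded into a string.

```python
def split_records(data, record_length):
    """Split input text into records according to record_length."""
    if record_length == 0:
        # A length of 0 means each record ends with a newline character.
        return data.splitlines()
    # Otherwise every record occupies exactly record_length characters,
    # including the record delimiter if the input file has one.
    return [data[i:i + record_length] for i in range(0, len(data), record_length)]
```

With the example profile, record_length 62 covers 61 characters of field data plus one delimiter character.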
3.11 output_text
This command specifies that the output file should
consist only of text with no headers or footers. output_text
should not be specified in addition to output_html.
3.12 output_html
This command specifies that the output file should
be in the HTML format. output_html should not be specified
in addition to output_text.
3.13 field Field1 1 10 TEXT 1
This command specifies a field in INFILE.TXT. In this example,
Field1 is the name of a field starting
with the first character of a record and is 10 characters
long. The TEXT argument means that the field, if printed
in OUTFILE.TXT, will be preceded with the
start_of_text_char
and followed with the end_of_text_char.
The final argument, 1 in this case, is an integer that
determines the print order of the fields. The fields are printed
to OUTFILE.TXT in numerical order of the print order
argument. A print order argument less than or equal to zero suppresses
that field from being printed.
field Field2 11 10 NUMBER 2
This example defines Field2 to be a field starting with character 11 of a record and 10 characters long. The print order of this field is 2.
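The way field definitions map onto a fixed-length record can be sketched in Python. The names Field and format_record are hypothetical, the defaults mirror the documented defaults (comma field delimiter, " text delimiters), and the sketch assumes that NUMBER fields are written without text delimiters, which the description above implies but does not state.

```python
from typing import NamedTuple

class Field(NamedTuple):
    name: str
    start: int    # 1-based position of the first character
    length: int   # number of characters
    kind: str     # "TEXT" or "NUMBER"
    order: int    # print order; values <= 0 suppress the field

def format_record(record, fields, field_delimiter=",", start_text='"', end_text='"'):
    """Extract the fields from one record and join them in print order."""
    printed = sorted((f for f in fields if f.order > 0), key=lambda f: f.order)
    parts = []
    for f in printed:
        value = record[f.start - 1:f.start - 1 + f.length]
        if f.kind == "TEXT":
            # TEXT fields are wrapped in the start and end text characters.
            value = start_text + value + end_text
        parts.append(value)
    return field_delimiter.join(parts)
```

Swapping the print order numbers of two fields is enough to swap their positions in the output.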
3.14 field_text Field10 [text_to_print] 2
This command creates a field containing the specified
text. In this example, Field10
is set to the string [text_to_print] for every record.
The final number, 2 in this case, indicates the print
order. The fields are printed in numerical order of the print
order argument. A print order argument less than or equal to zero
suppresses that field from being printed.
3.15 field_counter Field9 4
This command creates a field containing the number
of the record, starting with 1 for the first record. In this example,
Field9 is set to 000001 for the first record,
000002 for the second record, 000003 for the
third record, and so on. The final number, 4 in this
case, indicates the print order.
3.16 field_filename_field Field8 5
This command creates a field containing the default
name for the file created if record_per_file
is set to 1 and filename_field
is not specified. The final number, 5 in this case, indicates
the print order.
3.17 field_lookup field8 field1 lookup.txt 2 9
This command creates a field containing text from
another file. In this case, field8 is created by finding
the line in lookup.txt that begins with the text contained
in field1. The 2 after lookup.txt indicates
that the second word of the matched line is inserted into field8.
The final number, 9 in this case, indicates the print
order.
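The lookup can be pictured with the following Python sketch. It assumes that the first matching line wins, that words in the lookup file are separated by whitespace, and that a missing match yields an empty field; none of these details are stated above, and the function name lookup_word is hypothetical.

```python
def lookup_word(lookup_lines, key, word_number):
    """Return the word_number-th word of the first line that begins with key."""
    for line in lookup_lines:
        if line.startswith(key):
            words = line.split()
            if 1 <= word_number <= len(words):
                return words[word_number - 1]
            return ""  # matched line has too few words
    return ""  # no line begins with the key
```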
3.18 field_date field2 5
This command creates a field containing the date
PARSER.EXE is executed in the format 12/31/1997.
The final number, 5 in this case, indicates the print
order.
3.19 field_time field3 6
This command creates a field containing the time
PARSER.EXE is executed in the format 14:32:59.
The final number, 6 in this case, indicates the print
order.
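The output formats of the generated fields in 3.15, 3.18, and 3.19 correspond to the following Python format strings. The function names are hypothetical, and the six-digit width of the counter is inferred from the 000001 example rather than stated explicitly.

```python
from datetime import datetime

def counter_field(record_number):
    """Record number, zero-padded to six digits (000001, 000002, ...)."""
    return f"{record_number:06d}"

def date_field(now):
    """Date in the form 12/31/1997 (month/day/four-digit year)."""
    return now.strftime("%m/%d/%Y")

def time_field(now):
    """Time in the 24-hour form 14:32:59."""
    return now.strftime("%H:%M:%S")
```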
3.20 delete_alpha Field1
This command deletes all alphabetic characters
(a-z and A-Z) from the specified field (Field1 in this
example) before printing to OUTFILE.TXT.
3.21 delete_nonalpha Field2
This command deletes all non-alphabetic characters
(all characters other than a-z and A-Z) from the specified field
(Field2 in this example) before printing to OUTFILE.TXT.
3.22 delete_numeric Field3
This command deletes all numeric characters (0-9)
from the specified field (Field3 in this example) before
printing to OUTFILE.TXT.
3.23 delete_nonnumeric Field4
This command deletes all non-numeric characters
(all characters other than 0-9) from the specified field (Field4
in this example) before printing to OUTFILE.TXT.
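The four character-class commands in 3.20 through 3.23 can be expressed compactly in Python. This is a sketch under the assumption that only the ASCII ranges named above (a-z, A-Z, 0-9) are involved; the function names simply mirror the command names.

```python
import string

def delete_alpha(value):
    return "".join(c for c in value if c not in string.ascii_letters)

def delete_nonalpha(value):
    return "".join(c for c in value if c in string.ascii_letters)

def delete_numeric(value):
    return "".join(c for c in value if c not in string.digits)

def delete_nonnumeric(value):
    return "".join(c for c in value if c in string.digits)
```

Note that spaces and punctuation count as both non-alphabetic and non-numeric, so delete_nonalpha and delete_nonnumeric remove them as well.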
3.24 delete_left Field5 1
This command deletes from the field specified
(Field5) the number of characters specified (1)
from the beginning of the field.
3.25 delete_right Field6 1
This command deletes from the field specified
(Field6) the number of characters specified (1)
from the end of the field.
3.26 keep_left Field1 5
This command deletes from the field specified
(Field1) all of the characters except the number of characters
specified (5) starting with the beginning of the field.
3.27 keep_right Field2 5
This command deletes from the field specified
(Field2) all of the characters except the number of characters
specified (5) starting with the end of the field and
working back.
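The delete and keep commands in 3.24 through 3.27 behave like string slices. The Python sketch below mirrors the command names; how PARSER.EXE treats a count larger than the field length is not documented, so the behavior shown for that case is only what Python slicing happens to do.

```python
def delete_left(value, n):
    return value[n:]

def delete_right(value, n):
    return value[:-n] if n > 0 else value

def keep_left(value, n):
    return value[:n]

def keep_right(value, n):
    return value[-n:] if n > 0 else ""
```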
3.28 insert_left Field3 <<<
This command inserts at the beginning of the field
specified (Field3) the text
specified (<<<).
3.29 insert_right Field3 >>>
This command inserts at the end of the field specified
(Field3) the text specified
(>>>).
3.30 replace_char Field7 $. $+
This command replaces from the field specified
(Field7) all occurrences of the first character
specified (a period or $.) with the second
character specified (a plus sign or $+).
3.31 replace_text Field7 original\ text replacement\ text
This command replaces, in the field specified (Field7),
all occurrences of the first text
argument (original\ text) with the
second text argument (replacement\ text).
Note that embedded spaces must be preceded with a \.
3.32 strip Field6
This command deletes all leading and trailing
spaces, tabs, newlines, and carriage returns from the specified
field (Field6).
3.33 reorder Field6 3 4 $/ 5 6 $/ 1 2
This command reorders the characters of the field
specified (Field6) and can insert new characters as well.
In this case, the third character of Field6 is printed
first, followed by the fourth character, followed by the / character,
followed by the fifth character, followed by the sixth character,
followed by the / character, followed by the first character and
finally followed with the second character of Field6.
This command is very useful for rearranging date fields.
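As an illustration of the date use case, the Python sketch below applies the example reorder arguments to a six-character YYMMDD value. The function name reorder mirrors the command but is otherwise hypothetical, and the sketch only handles plain position numbers and simple $c tokens; a position beyond the end of the field raises an IndexError here, whereas PARSER.EXE's behavior in that case is not documented.

```python
def reorder(value, spec):
    """Build a new string from 1-based character positions and $c literals."""
    out = []
    for token in spec.split():
        if token.startswith("$"):
            out.append(token[1:])             # literal character to insert
        else:
            out.append(value[int(token) - 1])  # 1-based position in the field
    return "".join(out)

# A YYMMDD field becomes MM/DD/YY.
print(reorder("971231", "3 4 $/ 5 6 $/ 1 2"))  # prints 12/31/97
```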
3.34 delete_char Field5 $-
This command deletes from the field specified
(Field5) all occurrences of the character
specified (a dash or $-).
3.35 pad_left Field1 10
This command inserts spaces at the beginning of
the specified field (Field1) so that the specified field
has the specified number of characters (10). If the specified
field has more than the specified number of characters, the field
is left unaltered. See keep_left and
keep_right to truncate fields.
3.36 pad_right Field2 10
This command inserts spaces at the end of the
specified field (Field2) so that the specified field
has the specified number of characters (10). If the specified
field has more than the specified number of characters, the field
is left unaltered. See keep_left and
keep_right to truncate fields.
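In Python terms, the two padding commands correspond to str.rjust and str.ljust, both of which leave a string unchanged when it is already at least the requested width. The wrapper names below simply mirror the command names.

```python
def pad_left(value, width):
    # Spaces are inserted at the beginning, so the text ends up right-justified.
    return value.rjust(width)

def pad_right(value, width):
    # Spaces are inserted at the end, so the text ends up left-justified.
    return value.ljust(width)
```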
3.37 to_upper_case Field3
This command converts all lower case characters
in the specified field (Field3) to upper case.
3.38 to_lower_case Field4
This command converts all upper case characters
in the specified field (Field4) to lower case.
4. Comments
4.1 Additional Command Line Arguments
Two additional command line arguments are unadvertised features
for the advanced user:
PARSER.EXE INFILE.TXT [-oOUTFILE.TXT] [-pPROFILE.INI] -d -tFIELD
Where:
-d = Sets the Debug Flag to print out internal variables of the program. Used to find programming errors in PARSER.EXE.
-tFIELD = Results in only this one FIELD, as defined in PROFILE.INI, being displayed on the screen. Used to ensure the record_length command in PROFILE.INI is set properly.
Within a PROFILE FILE, whenever a character is specified (with the exception of the reorder line) it can be specified by either its ASCII code in hexadecimal, or by the character itself preceded by the $ character. Additionally, several special characters are also defined. Examples include:
$A = the character 'A'
41 = the character 'A' as well; this is the hexadecimal code
20 = the space character
$ = the space character too (This may not always work)
$\ = the space character
$\a = the Bell character
$\b = the backspace character
$\f = the form feed character
$\n = the newline character
$\r = the carriage return character
$\t = the horizontal tab character
$\v = the vertical tab character
$\\ = the \ character
The reorder line does not allow the use of hexadecimal codes.
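The character notation can be summarized with a small Python decoder. This is a sketch of the rules as listed, not the program's own code: the name decode_char is hypothetical, the token is assumed to have already been split off from the command line, and an unrecognized escape simply raises a KeyError.

```python
ESCAPES = {"a": "\a", "b": "\b", "f": "\f", "n": "\n", "r": "\r",
           "t": "\t", "v": "\v", "\\": "\\"}

def decode_char(token):
    """Decode one character argument such as $A, 41, $\\n or $\\."""
    if token.startswith("$"):
        body = token[1:]
        if body in ("", "\\"):
            return " "                # a bare $ or $\ stands for the space character
        if body.startswith("\\"):
            return ESCAPES[body[1:]]  # special characters such as $\n or $\t
        return body                   # the character itself, e.g. $A
    return chr(int(token, 16))        # otherwise a hexadecimal ASCII code
```

Under these rules both $A and 41 decode to the letter A, and both 20 and $\ decode to a space.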
Within a PROFILE FILE, whenever text is specified, it should be specified using ASCII characters and the following special characters:
\ = the space character
\a = the Bell character
\b = the backspace character
\f = the form feed character
\n = the newline character
\r = the carriage return character
\t = the horizontal tab character
\v = the vertical tab character
\\ = the \ character
Embedded spaces should have a leading \ since text arguments are space delimited.
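A text argument using these escapes could be decoded along the following lines. The Python function decode_text is a hypothetical illustration; it assumes that a backslash followed by any character not in the list above yields that character unchanged, and that a trailing lone backslash stands for a space, neither of which is documented.

```python
def decode_text(arg):
    """Expand the backslash escapes in one text argument."""
    escapes = {"a": "\a", "b": "\b", "f": "\f", "n": "\n", "r": "\r",
               "t": "\t", "v": "\v", "\\": "\\", " ": " "}
    out = []
    i = 0
    while i < len(arg):
        if arg[i] == "\\":
            if i + 1 < len(arg):
                nxt = arg[i + 1]
                out.append(escapes.get(nxt, nxt))
                i += 2
            else:
                out.append(" ")  # assumed: trailing backslash means a space
                i += 1
        else:
            out.append(arg[i])
            i += 1
    return "".join(out)
```

For instance, the argument Original\ Text decodes to the two words Original Text separated by a single space.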
parser.exe test.txt -otestout.txt -ptest.ini
This program is free software; you can redistribute it and/or
modify it under the terms of the
GNU General Public License
as published by the Free Software Foundation; either version 2
of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
If you discover any bugs, or have any questions concerning these programs, please send me an email (doerry@aol.com).