Several query processing tasks characterize the work
frequently done on flat files.
First, each record in a file must be read in and split apart into fields, according to the field separators. In Cymbal, this is most easily accomplished by using the tokens() function to iteratively read records and to split them up into fields. For example, when reading records from a file with | as the field separator, the following paradigm can be used:
for_each_time [ .vbl, ..., .vbl ]
is_such_that(
    [ .vbl, ..., .vbl ]
        = tokens( via _file_ for "file_path" upto "\n|" )
    ...
) do {
    /* anything you want to do with .vbl, ..., .vbl */
}
The file is conceived of as a long sequence of tokens separated by
either newlines or bars.
Daytona satisfies the assertion once for each record's worth of tokens.
As a shorthand, ftokens( ... ) can be written as
an abbreviation for tokens( via _file_ ... ).
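For readers more familiar with general-purpose languages, the token-stream view described above can be sketched in Python. This is only an analogue written for illustration, not Daytona's implementation: the helper names tokens and records and the fields_per_record parameter are hypothetical, and the sketch assumes every record carries the same number of fields.

```python
import io


def tokens(stream, separators="\n|"):
    """Yield the tokens of a stream, splitting at any separator character."""
    text = stream.read()
    # Drop one trailing separator so a final newline does not yield an empty token.
    if text and text[-1] in separators:
        text = text[:-1]
    if not text:
        return
    current = []
    for ch in text:
        if ch in separators:
            yield "".join(current)
            current = []
        else:
            current.append(ch)
    yield "".join(current)


def records(stream, fields_per_record):
    """Group the token stream into lists holding one record's worth of tokens."""
    group = []
    for token in tokens(stream):
        group.append(token)
        if len(group) == fields_per_record:
            yield group
            group = []


if __name__ == "__main__":
    sample = io.StringIO("smith|42|NJ\njones|17|NY\n")
    for name, count, state in records(sample, 3):
        print(name, count, state)
```

Because newlines and bars are treated alike, the record boundary is implied by the number of variables being filled rather than by the newline itself, which mirrors the "record's worth of tokens" wording above.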
Next, information taken from each record is used to cumulatively update one or more aggregate quantities computed over the entire file.
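The accumulation pattern can be illustrated with a short Python sketch. It is an analogue for illustration only; the two-field record layout (a name and a numeric amount separated by a bar) and the function name aggregate are assumptions, not taken from the text above.

```python
def aggregate(lines):
    """Return (record count, sum of amounts, largest amount) over all lines.

    Each line is assumed to look like "name|amount", a hypothetical layout.
    """
    count = 0
    total = 0.0
    largest = None
    for line in lines:
        _name, amount_text = line.rstrip("\n").split("|")
        amount = float(amount_text)
        # Update every running aggregate from the record just read.
        count += 1
        total += amount
        if largest is None or amount > largest:
            largest = amount
    return count, total, largest


if __name__ == "__main__":
    print(aggregate(["widgets|1.5\n", "gadgets|2.5\n"]))
```

Each record is visited once and then discarded, so the memory needed does not grow with the size of the file.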
By merely using the with_sep argument to the Write() PROCEDURE, users cause the written quantities to be separated by the with_sep string. Quantities being written are converted to strings as needed.
with_sep "|" do Write_Line( "abc", 45.7, ^4-15-99^, .x, .y+.z );
Perl users would use join() in conjunction with print().
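In the same comparative spirit, Python's built-in print() accepts a sep keyword that plays a role similar to with_sep. The sketch below is an analogue only; write_line is a hypothetical wrapper, and Python's default string forms for dates and numbers will not necessarily match Cymbal's.

```python
import datetime
import io
import sys


def write_line(*values, sep="|", stream=None):
    """Write the values on one line, converted to strings and joined by sep."""
    target = stream if stream is not None else sys.stdout
    # print() converts each value with str() and inserts sep between them.
    print(*values, sep=sep, file=target)


if __name__ == "__main__":
    x, y, z = 7, 2, 3
    write_line("abc", 45.7, datetime.date(1999, 4, 15), x, y + z)
```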
Cymbal uses concat() to concatenate quantities together as strings separated by an optional separator string. Non-string quantities are converted to strings automatically. As the import from sys.env.cy indicates, concat() takes a TUPLE of arguments as its first argument and an optional separator string as its second.
/* automatically included from sys.env.cy */
import: STR FUN( manifest TUPLE[ ( 1-> ) OK_OBJ ], STR .sep = "" ) concat

/* an example of use */
set .x = concat( [ 12, "abc", .z, my_fun(.x, 45.67 ), ^1-1-97^ ], "<-->" );

Perl uses the join function for this purpose.
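A rough Python counterpart of the signature above might look like the following. It is a sketch under stated assumptions: this concat is a hypothetical Python function that mirrors only the shape of the Cymbal import (a sequence of items plus an optional separator defaulting to the empty string), and str() stands in for Cymbal's automatic conversion.

```python
def concat(items, sep=""):
    """Join items as strings, converting non-string values with str()."""
    return sep.join(str(item) for item in items)


if __name__ == "__main__":
    print(concat([12, "abc", 4.5], "<-->"))
    print(concat(["no", "separator", 1]))
```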
It is very easy for Cymbal to sort TUPLES in a lexicographic way according to increasing or decreasing values of their components in any order. One of several ways to do this is by means of Is_The_Next_Where which is used to create a sorted box of all TUPLES satisfying an assertion:
for_each_time [ .vbl, ..., .vbl ]
is_such_that(
    [ .vbl, ..., .vbl ] Is_The_Next_Where( assertion )
        sorted_by_spec [ signed-integer-sequence ]
){
    /* anything you want to do with the sorted .vbl, ..., .vbl */
}
If [ signed-integer-sequence ] were [ 2, -4 ], the TUPLES would be sorted first by increasing values of the second component and then by decreasing values of the fourth component of each TUPLE. Duplicate TUPLES can be removed simply by replacing Is_The_Next_Where with Is_Something_Where.
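The meaning of a signed sort specification such as [ 2, -4 ] can be made concrete with a small Python sketch. It is an analogue for illustration, not a description of how Daytona sorts: sort_by_spec and its unique flag are hypothetical names, with unique=True standing in loosely for the duplicate removal mentioned above.

```python
def sort_by_spec(rows, spec, unique=False):
    """Sort tuples by a signed, 1-based component spec such as [2, -4].

    A positive entry sorts that component in increasing order, a negative
    entry in decreasing order; earlier entries take precedence.
    """
    # dict.fromkeys drops duplicate tuples while keeping first occurrences.
    result = list(dict.fromkeys(rows)) if unique else list(rows)
    # Python's sort is stable, so sorting by the least significant key first
    # and the most significant key last yields the lexicographic order.
    for signed_index in reversed(spec):
        position = abs(signed_index) - 1
        result.sort(key=lambda row: row[position], reverse=signed_index < 0)
    return result


if __name__ == "__main__":
    rows = [("a", 2, "x", 5), ("b", 1, "y", 7), ("c", 2, "z", 9), ("d", 1, "w", 3)]
    for row in sort_by_spec(rows, [2, -4]):
        print(row)
```

With the sample rows, the tuples whose second component is 1 come first, and within each group the fourth component decreases.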