---------------------------------------------------------
The Liberty Basic Newsletter - Issue #33 - APR 99
"Knowledge is a gift we receive from others."
		- Michael T. Rankin
---------------------------------------------------------
In This Issue:

Parts One and Two of a multi-part series on Disk File
Functions by Dean Hodgson.  Many thanks to Dean for
this OUTSTANDING series!
------------
In Future Issues:

- Parts 3, 4 and 5 of Dean's series on Disk File Functions.

- Debugging.

- More great articles by guest authors!
---------------------------------------------------------
DISK FILE HANDLING IN LIBERTY BASIC
By Dean Hodgson
copyright (c) 1999
dhodgson@nexus.edu.au

Part 1 - Basic principles of disk file handling
Part 2 - LB native disk file commands

This series of articles is intended to provide you with a 
broad understanding of disk file handling techniques in 
Liberty Basic. There are many ways to achieve this, and 
the method you employ depends on what you need to do. It 
is not the intention of these articles to provide you with 
ready-made code you can cut and paste directly into your 
programs. Rather, the concentration will be on the concepts 
behind disk file handling with short examples given.

Note: the statements DefaultDir$, Drive$, FILEDIALOG, FILES, 
KILL, MKDIR, NAME and RMDIR are not covered in these articles.

=====================================================================

Part 1 - Basic Principles of Disk File Handling

There are three main types of files: Sequential, Random Access 
and Binary.

Sequential files contain data that must be read from the start 
to the finish every time. These files typically contain text 
information stored line-by-line. The whole file is read into 
variables every time it is used. The data is updated in memory 
then the entire file is written back to the disk. Sequential 
files are great when the amount of data is limited and/or only 
its order is important.

In Random Access files, individual records of data are read into 
the computer's memory, with usually only one record being operated 
on at a time. Random files can be very large, much larger than the 
memory capacity of the computer. Data can be read from or written 
to any spot in the file at any time. Random files are the 
'backbone' of most database applications and are essential when 
sharing files in network situations.

A third type of file is a Binary file. In this type, the file 
may or may not have a particular record structure like a Random 
file and/or it may or may not be sequential. Executable programs 
are Binary files. Liberty Basic only has limited Binary file 
handling capability. However, Binary files can be accessed via 
API and Deanslib functions.

A typical file procedure is:
   Open the file
     perform work on it
   Close the file

It is a good idea not to keep files open any longer than 
necessary. Try not to open a file at the start of your program 
and leave it open until the program finishes. This is especially 
important when sharing files on networks. And remember: there 
are *always exceptions* to any rule anyone puts up.
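
As a minimal sketch of the open / work / close pattern, the listing 
below uses the native statements described in Part 2. The file name 
TEMP.TXT and the data are made up for illustration. Note that the 
file is only open while data is actually being read or written; the 
work on the data happens in memory while the file is closed.

    OPEN "TEMP.TXT" FOR OUTPUT AS #1    'create a small file to work with
    PRINT#1,"hello"
    CLOSE #1

    OPEN "TEMP.TXT" FOR INPUT AS #1     'open the file...
    LINE INPUT#1,A$                     '...read the data into memory...
    CLOSE #1                            '...and close it straight away
    A$=A$+" world"                      'work on the data, file closed
    OPEN "TEMP.TXT" FOR OUTPUT AS #1    'reopen only to write it back
    PRINT#1,A$
    CLOSE #1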

===================================================================

Part 2 - LB Disk File Commands

Liberty Basic has a range of built-in or 'native' disk file 
commands for dealing with both sequential and random access 
files. The statements are OPEN, CLOSE#, INPUT#, LINE INPUT#, 
PRINT#, INPUT$(), FIELD#, GET#, and PUT#.

-------------------------------------------------------------------

SEQUENTIAL FILES
----------------
In a typical Sequential file, data is stored in a particular 
order but the length of each line can vary. In LB, a sequential 
file is opened either for Input or Output:

This opens a sequential file for reading:
  OPEN "filename" FOR INPUT AS #lbfilehandle
And this opens a file for writing:
  OPEN "filename" FOR OUTPUT AS #lbfilehandle

"filename" is the name of your file. It can be up to 8 characters 
long followed by a full stop, then an optional three-character 
extension. While Dos permits other symbols, it is safest to stick 
to letters, numbers, the dash (-) and the underscore (_) symbols. 
Many punctuation symbols are interpreted by Dos as having special 
functions -- e.g. * / \ ? ~ and so forth. The case of filename 
letters is not important as Dos automatically changes them 
internally to upper case.

The #lbfilehandle is a word, phrase or number you assign to the 
open file. This is not the Windows file handle value, which is 
always a number and which is not accessible for LB disk commands. 
It is also not a variable as such but a label designating the 
handle you have assigned.

Trying to write to a file opened for Input generates an error 
message, as does trying to read a file opened for Output. Trying 
to open a file for Input that does not exist generates an error -- 
therefore if you are unsure it is worth testing to see if a file 
exists first. If a file already exists and you open it for Output, 
that file will be cleared. If the file does not exist when it is 
opened for Output, it is automatically created.
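
The short sketch below illustrates the point about Output clearing 
an existing file. TEMP.TXT and its contents are made up for the 
demonstration. After the second OPEN ... FOR OUTPUT, only the newly 
written line should remain in the file.

    OPEN "TEMP.TXT" FOR OUTPUT AS #1    'file is created if missing
    PRINT#1,"first version"
    CLOSE #1
    OPEN "TEMP.TXT" FOR OUTPUT AS #1    'reopening for Output clears it
    PRINT#1,"second version"
    CLOSE #1
    OPEN "TEMP.TXT" FOR INPUT AS #1     'now read back what is there
    LINE INPUT#1,A$
    CLOSE #1
    PRINT A$                            'should show: second version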

To close an open file use CLOSE #lbfilehandle.

To read data from an open Sequential file, use INPUT#, 
LINE INPUT# or INPUT$(). To write data use PRINT#.

Every line in a sequential file ends in two special characters: 
a carriage-return (ASCII value of 13) then a line-feed 
(ASCII value of 10). LB's commands automatically deal with 
these characters.

INPUT#lbfilehandle, var1$, var2
-------------------------------
This statement inputs data from a sequential file opened for 
reading into the variables listed. The variables listed should 
match those used in the PRINT# statement to write the data. More 
than one variable can be listed, separated by commas. The INPUT# 
statement interprets commas as separating multiple variables on 
the same line of data. A carriage-return / line-feed combination 
signifies the end of a line. INPUT# does not remove quotation marks 
around string variables, and it skips the end of line characters.

LINE INPUT#lbfilehandle, var$
-----------------------------
This statement reads an entire line of text into a string variable 
including all characters. Reading is stopped when a carriage-return 
character is encountered. Line-feed characters (ASCII value 10) are 
ignored. All other characters are read, including commas and quotes.

PRINT#lbfilehandle, var1$ ; "," ; var2 ...
------------------------------------------
Print# is the main statement for writing sequential file data. 
The data is printed out to the file as text. If a semi-colon is 
omitted at the end of the PRINT# statement, then a 
carriage-return/line-feed combination is also automatically 
written to mark the end of the line. If the statement ends in 
a semi-colon the end-of-line marker is not written. Multiple 
variables can be listed, but each must be separated by a ;","; 
combination and not just a simple comma. In LB, PRINT# does not 
automatically put quotation marks around string variables. If you 
require this for your data to be in "comma-delimited format" then 
you must add these to the variable before printing to the file 
-- i.e. A$=CHR$(34)+"Data"+CHR$(34).
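
A small sketch of the trailing semi-colon rule, using a made-up 
file name. Because the first PRINT# ends in a semi-colon, no 
end-of-line marker is written after it, so both pieces should come 
back as a single line.

    OPEN "TEMP.TXT" FOR OUTPUT AS #1
    PRINT#1,"abc";                      'semi-colon: no end-of-line
    PRINT#1,"def"                       'end-of-line written here
    CLOSE #1
    OPEN "TEMP.TXT" FOR INPUT AS #1
    LINE INPUT#1,A$                     'read the one line back
    CLOSE #1
    PRINT A$                            'should show: abcdef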

Here is an example that creates and then reads a simple sequential 
file:

A$="abc"                            'This is the data to be written
B$="def"
OPEN "TEMP.TXT" FOR OUTPUT AS #1    'The file Temp.Txt is opened for 
                                    ' output using filehandle #1.
PRINT#1,A$;",";B$                   'The data is written to the line
CLOSE #1                            'The file is closed
OPEN "TEMP.TXT" FOR INPUT AS #2     'Open the file for reading as #2
INPUT#2,A$,B$                       'Input the two variables
CLOSE #2                            'Close the file
PRINT A$                            'Display the first variable
PRINT B$                            'Display the second variable

You should see abc for A$ and def for B$.

It is possible to add quote marks and commas to strings. However, 
when INPUT# is used, LB will still treat the comma as separating 
two variables.

A$=CHR$(34)+"123,456"+CHR$(34)      'This is the data to be written;
                                    ' note the comma and quotes!!
OPEN "TEMP.TXT" FOR OUTPUT AS #1    'The file Temp.Txt is opened for
                                    ' output using filehandle #1.
PRINT#1,A$                          'The data is written to the line
CLOSE #1                            'The file is closed
OPEN "TEMP.TXT" FOR INPUT AS #2     'Open the file for reading as #2
INPUT#2,A$,B$                       'Input the two variables
CLOSE #2                            'Close the file
PRINT A$                            'Display the first variable
PRINT B$                            'Display the second variable

In this example, A$ is read and displayed as "123. You'll see the 
quote mark at the start. In many other Basics, 123,456 would have 
been read into A$ without quotes around it. This is because other 
Basics interpret the quotation marks as surrounding one whole 
string. LB does not do this! If it sees a comma within a string, 
it stops reading into the variable.

The way to include commas is LINE INPUT#. If LINE INPUT#2,A$ 
were used instead of INPUT#2,A$,B$ above then the line "123,456" 
would be read, comma and all. A statement like LINE INPUT#2,A$,B$ 
reads two lines.
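
For comparison, here is the same data read back with LINE INPUT#. 
This is only a sketch reusing the made-up TEMP.TXT file from the 
example above; the whole line, quotes and comma included, should 
end up in A$.

    A$=CHR$(34)+"123,456"+CHR$(34)      'data with quotes and a comma
    OPEN "TEMP.TXT" FOR OUTPUT AS #1
    PRINT#1,A$                          'write it as one line
    CLOSE #1
    OPEN "TEMP.TXT" FOR INPUT AS #2
    LINE INPUT#2,A$                     'read the entire line
    CLOSE #2
    PRINT A$                            'quotes and comma are kept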

LB cannot read data from sequential files directly into an array. 
Instead, you have to read into a simple variable then assign it 
to the array. If your file were to contain the numbers from 1 to 10, 
one number per line, the following could be used to read it:

DIM a$(10)                          'allocate space for the array
OPEN "testfile.txt" FOR INPUT AS #1 'open the file
FOR counter=1 TO 10                 'loop through each line
  LINE INPUT#1,temp$                'input a line into a temp variable
  a$(counter)=temp$                 'assign to the array element
NEXT counter                        'end of the loop
CLOSE #1                            'close the file

INPUT$(#handle,length)
----------------------
There is one other input statement used in LB: INPUT$. This is a 
function that reads a specified number of characters into a string.  
Example:  a$=INPUT$(#1,50) would read 50 characters and assign them 
to a$. LB reads the entire length specified regardless of the 
characters found! INPUT$ therefore transcends typical Sequential 
file statements and crosses over into the Binary file, where there 
may be no set structure to the data.
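
The sketch below shows INPUT$ reading straight across an end of 
line. It assumes the standard string functions LEN, ASC and MID$ 
and a made-up file TEMP.TXT. Five characters are requested, so the 
three letters of the first line plus the carriage-return and 
line-feed that follow them should be returned.

    OPEN "TEMP.TXT" FOR OUTPUT AS #1
    PRINT#1,"abc"                       'line 1, ends in CR and LF
    PRINT#1,"def"                       'line 2
    CLOSE #1
    OPEN "TEMP.TXT" FOR INPUT AS #1
    a$=INPUT$(#1,5)                     'abc plus the 2 end-of-line chars
    CLOSE #1
    PRINT LEN(a$)                       'number of characters read
    PRINT ASC(MID$(a$,4,1))             'expect 13, the carriage-return
    PRINT ASC(MID$(a$,5,1))             'expect 10, the line-feed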

END OF FILE DETECTION (EOF)
---------------------------
LB contains the function EOF(#filehandle) to detect when the end 
of a file has been reached. It is used in situations where the
length of a file or its structure may not be known. EOF returns 
either a 0 (false) if the end of file has not been reached, or 
a -1 (true) if it has. In the example above, if we didn't know 
how many numbers were in the file we could have used:

DIM a$(100)                         'allocate space for the array
OPEN "testfile.txt" FOR INPUT AS #1 'open the file
counter=0                           'initialize the counter
WHILE EOF(#1)=0                     'loop until end of file found
  LINE INPUT#1,temp$                'input line into a temp variable
  counter=counter+1                 'increment the array pointer
  a$(counter)=temp$                 'assign to the array element
WEND                                'end of the loop
CLOSE #1                            'close the file

In this case the variable 'counter' holds the number of entries 
read into the array.
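
One thing to watch with this loop is a file holding more lines 
than the array has room for. A possible safeguard, offered here 
only as a sketch, is to test the counter as well as EOF. The limit 
of 100 simply matches the DIM statement, and the message text is 
made up.

    DIM a$(100)                         'room for 100 entries
    OPEN "testfile.txt" FOR INPUT AS #1 'open the file
    counter=0                           'initialize the counter
    WHILE (EOF(#1)=0) AND (counter<100) 'stop at end of file or full array
      LINE INPUT#1,temp$                'input line into a temp variable
      counter=counter+1                 'increment the array pointer
      a$(counter)=temp$                 'assign to the array element
    WEND
    IF EOF(#1)=0 THEN PRINT "More than 100 lines; the rest were skipped"
    CLOSE #1                            'close the file
    PRINT counter;" lines read"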

LENGTH OF FILE (LOF)
--------------------
The LOF(#filehandle) function returns the length of the open file 
in bytes. This can be handy if you don't know the size of a file 
and want to read the whole thing using INPUT$ into one string 
variable:

    OPEN "testfile.txt" FOR INPUT AS #1   'open the file
    a$=INPUT$(#1,LOF(#1))                 'read whole file!
    CLOSE #1                              'close the file

Because LB strings can be megabytes in size, really large files 
can be read this way. Parts of the file can then be examined using 
the MID$() string function. Of course, because the whole file is 
within one string variable, you'll have to work out your own way 
of breaking the parts up.
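
One possible way of breaking the string up again is sketched below. 
It assumes every line in testfile.txt ends in a carriage-return / 
line-feed pair and that INSTR accepts a starting position as its 
third argument. The variable names linestart and crpos are made up 
for this sketch.

    OPEN "testfile.txt" FOR INPUT AS #1   'open the file
    a$=INPUT$(#1,LOF(#1))                 'read whole file
    CLOSE #1                              'close the file
    linestart=1                           'where the current line begins
    crpos=INSTR(a$,CHR$(13),linestart)    'find the next carriage-return
    WHILE crpos>0
      PRINT MID$(a$,linestart,crpos-linestart)  'text before the CR
      linestart=crpos+2                   'skip over the CR and LF
      crpos=INSTR(a$,CHR$(13),linestart)
    WEND
    'anything left over is a last line with no end-of-line marker
    IF linestart<=LEN(a$) THEN PRINT MID$(a$,linestart,LEN(a$)-linestart+1)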

--------------------------------------------------------------------

RANDOM ACCESS FILES
-------------------
As stated previously, Random Access files use a "record" structure 
where the entire file is divided into many records. Each record has 
the same length -- i.e. is the same number of bytes long. Each 
record is also subdivided into "fields", each one of which is a 
given length. These are called fixed-length fields and records. 
Variable-length record files are possible but their complex 
structure is beyond the scope of these articles.

OPEN "filename" FOR RANDOM AS #lbfilehandle LEN=value
-----------------------------------------------------
This variant of OPEN is used for opening random access files. 
RANDOM means it is a random type file that can be either read 
or written to (using the same OPEN). Note the LEN= clause at 
the end, which specifies the length of each record.

FIELD#filehandle,length AS var$ , ...
-------------------------------------
The fields in a random file are set up using the FIELD# statement. 
Each field has a variable name and specified length. As an example, 
let's say that we want to keep a simple database about a collection 
of books. The fields we want might include Title, Authors, Series, 
Comments and the number of pages.

  OPEN "books.dat" FOR RANDOM AS #1 LEN=425
  FIELD#1,_
    100 AS Title$,_
    60 AS Authors$,_
    60 AS Series$,_
    200 AS Comments$,_
    5 AS Pages

The value after LEN= is 100 + 60 + 60 + 200 + 5 or the total 
length of all the fields in FIELD#. Field# always follows the 
OPEN random file statement and must be used before GET# or PUT#. 
Note that for the numeric variable Pages, the field reserves enough 
space to store the number of digits in the largest number we expect 
to have. LB stores numeric variables as text rather than in the 
IEEE, BCD, MBF or other native 'machine formats' used by other 
languages. In those formats an integer, for example, is a number 
between -32768 and +32767 and is stored in only 2 bytes instead of 
the 6 characters it can need as text. Storing numbers as text 
avoids having to use MKI$, CVI, etc. as is needed in Quick Basic.

PUT#filehandle,recordnumber
---------------------------
This statement writes the data held in the variables listed in 
FIELD# to the specified record number. The first record number 
in a LB random file is 1 (not 0). If the data in a given field 
is shorter than the length specified, LB adds blank spaces to fill 
it out. If the data is longer, LB truncates the string. This is 
automatic 'left justification' and is a real advantage over 
other Basics, where you must use LSET or RSET to justify the data 
before using PUT.

GET#filehandle,recordnumber
---------------------------
The GET# statement reads an entire record and fills the variables 
listed in FIELD#. The FIELD# layout in effect for GET# should be 
the same one that was used with the PUT# which created the record.
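
As a rough sketch, here is the books.dat layout from above with one 
record written and read back. The book details are made up, and 
running this will overwrite record 1 of any existing books.dat. 
Note that Pages is an ordinary numeric variable even though it is 
stored in the file as text.

    OPEN "books.dat" FOR RANDOM AS #1 LEN=425
    FIELD#1,_
      100 AS Title$,_
      60 AS Authors$,_
      60 AS Series$,_
      200 AS Comments$,_
      5 AS Pages
    Title$="A Sample Title"             'made-up data for one book
    Authors$="A. N. Author"
    Series$="None"
    Comments$="Test record"
    Pages=312
    PUT#1,1                             'write it as record 1
    Title$="" : Pages=0                 'clear, to prove GET# refills them
    GET#1,1                             'read record 1 back
    PRINT Title$                        'title, padded out to 100 chars
    PRINT Pages+1                       'Pages can be used in arithmetic
    CLOSE #1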

GETTRIM#filehandle,recordnumber
-------------------------------
GET# above reads each field 'as is' including blank spaces at the 
end of each field. GETTRIM removes leading and trailing blank spaces 
around each variable listed in FIELD#. This is something unique to 
LB and is very handy.
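
The difference can be seen with a sketch like the following, which 
uses a made-up file TEMP.DAT and prints square brackets around the 
field so that any padding is visible.

    OPEN "TEMP.DAT" FOR RANDOM AS #1 LEN=15   'open file
    FIELD#1,10 AS A$,5 AS B$                  'set up fields
    A$="123" : B$="456"                       'data for the record
    PUT#1,1                                   'write the record
    GET#1,1                                   'read it 'as is'
    PRINT "[";A$;"]"                          'padded to 10 characters
    GETTRIM#1,1                               'read it again, trimmed
    PRINT "[";A$;"]"                          'should show: [123]
    CLOSE #1                                  'close the file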

Basically that is the gist of random access files using LB's 
commands. Here is a short example that creates two records.

    OPEN "TEMP.DAT" FOR RANDOM AS #1 LEN=15   'open file
    FIELD#1,10 AS A$,5 AS B$                  'set up fields
    A$="123" : B$="456"                       'data for first record
    PUT#1,1                                   'write the record
    A$="789" : B$="ABC"                       'data for second record
    PUT#1,2                                   'write the record
    CLOSE #1                                  'close the file
    OPEN "TEMP.DAT" FOR RANDOM AS #2 LEN=15   'open file
    FIELD#2,10 AS A$,5 AS B$                  'set up fields (same as above)
    GET#2,1                                   'read first record
    PRINT A$                                  'show fields
    PRINT B$
    GET#2,2                                   'read second record
    PRINT A$                                  'show fields
    PRINT B$
    CLOSE #2                                  'close file
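
If the number of records is not known in advance, one approach is 
to divide the file length by the record length. The sketch below 
assumes that LOF works on a file opened for RANDOM and that the 
file holds nothing but whole 15-byte records, such as the TEMP.DAT 
file created above. The variable names records and n are made up.

    OPEN "TEMP.DAT" FOR RANDOM AS #1 LEN=15   'open file
    FIELD#1,10 AS A$,5 AS B$                  'set up fields
    records=LOF(#1)/15                        'file length / record length
    FOR n=1 TO records                        'records are numbered from 1
      GET#1,n                                 'read record n
      PRINT n;" ";A$;" ";B$                   'show its fields
    NEXT n
    CLOSE #1                                  'close file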

-------------------------------------------------------------------

LIBERTY BASIC LIMITATIONS
-------------------------
LB's file commands are very useful and easy to use but do have a 
few limitations.

* With a random access file, there is no easy way to read or write 
to part of a record. GET# and PUT# only deal with whole records. 
There is no function to position the 'file pointer' to a specific 
spot in a file or record and then read or write selected data.

* Network access of files is not supported via these commands. 
Files are opened for single-user access only. On a network, a 
computer trying to open a file that has been opened by another 
computer generates a sharing violation error.

* LB can only read text data. There is no ability to read numeric 
data created by a different program and stored in IEEE or another 
format.

Using the API file functions can overcome some of these limitations. 
Deanslib.Dll functions can be used to overcome others.

=====================================================================

---------------------------------------------------------
 Newsletter compiled and edited by: Brosco and Alyce.
 Comments, requests or corrections: Hit 'REPLY' now!
---------------------------------------------------------