
4 High level functions for buffered I/O
 4.1 Types and the creation of File objects
 4.2 Reading and writing
 4.3 Other functions
 4.4 Inter process communication

4 High level functions for buffered I/O

The functions in the previous sections are intended to provide direct access to the low level I/O functions in the C library. Thus, their calling conventions strictly follow those of the originals.

The functionality described in this section is implemented completely in the GAP language and is intended to provide a good interface for programming in GAP. The fundamental object for I/O on the C library level is the file descriptor, which is just a non-negative integer representing an open file of the process. The basic idea is to wrap up file descriptors in GAP objects that do the buffering.

Note that considerable care has been taken to ensure that one can do I/O multiplexing with buffered I/O. That is, before any read or write operation one can always make sure that the operation will not block. This is crucial when one wants to serve more than one I/O channel from the same (single-threaded) GAP process. This design principle sometimes made it necessary to provide more than one function for a certain operation. Those functions usually differ in a subtle way with respect to their blocking behaviour.

One remark applies again to nearly all functions presented here: If an error is indicated by the returned value fail one can use the library function LastSystemError (Reference: LastSystemError) to find out more about the cause of the error. This fact is not mentioned with every single function.
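As a minimal sketch of this convention, the following tries to open a file in a directory that is assumed not to exist and then inspects the record returned by LastSystemError. The path is hypothetical, and only the message component of the record is used here.

```gap
LoadPackage("io");

# Hypothetical path, assumed not to exist on this machine.
f := IO_File("/no-such-directory-for-this-demo/data.txt", "r");
if f = fail then
    err := LastSystemError();
    Print("open failed: ", err.message, "\n");
else
    IO_Close(f);
fi;
```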

4.1 Types and the creation of File objects

The wrapped file objects are in the following category:

4.1-1 IsFile
‣ IsFile( o )( category )

Returns: true or false

The category of File objects.

To create objects in this category, one uses the following function:

4.1-2 IO_WrapFD
‣ IO_WrapFD( fd, rbufsize, wbufsize )( function )

Returns: a File object

The argument fd must be a file descriptor (i.e. an integer) or -1 (see below).

rbufsize can either be false for unbuffered reading or an integer buffer size or a string. If it is an integer, a read buffer of that size is used. If it is a string, then fd must be -1 and a File object that reads from that string is created.

wbufsize can either be false for unbuffered writing or an integer buffer size or a string. If it is an integer, a write buffer of that size is used. If it is a string, then fd must be -1 and a File object that appends to that string is created.

The result of this function is a new File object.
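The two kinds of arguments can be sketched as follows. The first File object reads from a string, the second wraps descriptor 1, which is assumed to be standard output as on Unix systems. The variable names are only illustrative.

```gap
LoadPackage("io");

# Read from a string: fd must be -1 and the string serves as the read buffer.
src := "first line\nsecond line\n";
f := IO_WrapFD(-1, src, false);
first := IO_ReadLine(f);          # the line end character is included
IO_Close(f);

# Wrap an already open descriptor (1, assumed to be standard output) with
# unbuffered reading and a write buffer of IO.DefaultBufSize bytes.
out := IO_WrapFD(1, false, IO.DefaultBufSize);
IO_Write(out, "first line was: ", first);
IO_Flush(out);
# out is deliberately not closed, since that would close descriptor 1.
```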

A convenient way to do this for reading or writing of files on disk is the following function:

4.1-3 IO_File
‣ IO_File( filename[, mode] )( function )
‣ IO_File( filename[, bufsize] )( function )
‣ IO_File( filename, mode, bufsize )( function )

Returns: a File object or fail

The argument filename must be a string specifying the path name of the file to work on. mode must also be a string with possible values "r", "w", or "a", meaning read access, write access (with creating and truncating), and append access respectively. If mode is omitted, it defaults to "r". bufsize, if given, must be a positive integer or false, otherwise it defaults to IO.DefaultBufSize. Internally, the IO_open (3.2-43) function is used and the result file descriptor is wrapped using IO_WrapFD (4.1-2) with bufsize as the buffer size.

The result is either fail in case of an error or a File object in case of success.

Note that there is a similar function IO_FilteredFile (4.4-9) which also creates a File object but with additional functionality with respect to a pipeline for filtering. It is described in its section in Section 4.4. There is some more low-level functionality to acquire open file descriptors. These can be wrapped into File objects using IO_WrapFD (4.1-2).
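A short sketch of the three modes of IO_File, using a file in a temporary directory created with the GAP library functions DirectoryTemporary and Filename. The file name is hypothetical.

```gap
LoadPackage("io");

path := Filename(DirectoryTemporary(), "demo.txt");

# "w" creates (or truncates) the file.
f := IO_File(path, "w");
if f = fail then
    Error("cannot open ", path, ": ", LastSystemError().message);
fi;
IO_Write(f, "one\n");
IO_Close(f);

# "a" appends; false as bufsize requests unbuffered access.
f := IO_File(path, "a", false);
IO_Write(f, "two\n");
IO_Close(f);

# Without a mode the file is opened for reading with the default buffer size.
f := IO_File(path);
content := IO_ReadUntilEOF(f);
IO_Close(f);
Print(content);
```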

4.2 Reading and writing

Once a File object is created, one can use the following functions on it:

4.2-1 IO_ReadUntilEOF
‣ IO_ReadUntilEOF( f )( function )

Returns: a string or fail

This function reads all data from the file f until the end of file. The data is returned as a GAP string. If the file is already at end of file, an empty string is returned. If an error occurs, then fail is returned. Note that you still have to call IO_Close (4.2-16) on the File object to properly close the file later.

4.2-2 IO_ReadBlock
‣ IO_ReadBlock( f, len )( function )

Returns: a string or fail

This function gets two arguments: the first argument f must be a File object and the second argument len must be a positive integer. The function tries to read len bytes and returns a string of that length. If and only if the end of file is reached earlier, fewer bytes are returned. If an error occurs, fail is returned. Note that this function blocks until either len bytes are read, or the end of file is reached, or an error occurs. For pipes or internet connections it is possible that no more data is currently available. However, by definition the end of file is only reached after the connection has been closed by the other side!
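The "fewer bytes only at end of file" rule gives a simple loop for fixed-size records. The following sketch writes ten bytes to a hypothetical temporary file and reads them back in blocks of four.

```gap
LoadPackage("io");

path := Filename(DirectoryTemporary(), "blocks.bin");
f := IO_File(path, "w");
IO_Write(f, "abcdefghij");
IO_Close(f);

f := IO_File(path, "r");
chunks := [];
repeat
    block := IO_ReadBlock(f, 4);
    if block = fail then
        Error("read failed: ", LastSystemError().message);
    fi;
    if Length(block) > 0 then
        Add(chunks, block);
    fi;
until Length(block) < 4;          # a short block means end of file
IO_Close(f);
```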

4.2-3 IO_ReadLine
‣ IO_ReadLine( f )( function )

Returns: a string or fail

This function gets exactly one argument, which must be a File object f. It reads one line of data, where the definition of line is operating system dependent. The line end character(s) are included in the result. The function returns a string with the line in case of success and fail in case of an error. In the latter case, one can query the error with LastSystemError (Reference: LastSystemError).

Note that the reading is done via the buffer of f, such that this function will be quite fast also for large amounts of data.

If the end of file is hit without a line end, the rest of the file is returned. If the file is already at end of file before the call, then a string of length 0 is returned. Note that this is not an error but the standard end of file convention!
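The end of file convention leads to the following reading loop. This is a sketch that assumes a system on which lines end with "\n"; the file name is hypothetical and the last line deliberately has no line end.

```gap
LoadPackage("io");

path := Filename(DirectoryTemporary(), "lines.txt");
f := IO_File(path, "w");
IO_Write(f, "alpha\nbeta\ngamma");
IO_Close(f);

f := IO_File(path, "r");
lines := [];
while true do
    line := IO_ReadLine(f);
    if line = fail then
        Error("read failed: ", LastSystemError().message);
    fi;
    if Length(line) = 0 then      # empty string: end of file, not an error
        break;
    fi;
    Add(lines, line);
od;
IO_Close(f);
```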

4.2-4 IO_ReadLines
‣ IO_ReadLines( f[, max] )( function )

Returns: a list of strings or fail

This function gets one or two arguments, the first of which must always be a File object f. It reads lines of data (where the definition of line is operating system dependent) either until end of file (without a second argument) or up to max lines (with a second argument max). A list of strings with the result is returned if everything went well, and fail otherwise. In the latter case, one can query the error with LastSystemError (Reference: LastSystemError).

Note that the reading is done via the buffer of f, such that this function will be quite fast also for large amounts of data.

If the file is already at the end of file, the function returns a list of length 0. Note that this is not an error but the standard end of file convention!

4.2-5 IO_HasData
‣ IO_HasData( f )( function )

Returns: true or false

This function takes one argument f which must be a File object. It returns true or false according to whether there is data to read available in the file f. A return value of true guarantees that the next call to IO_Read (4.2-6) on that file will succeed without blocking and return at least one byte or an empty string to indicate the end of file.

4.2-6 IO_Read
‣ IO_Read( f, len )( function )

Returns: a string or fail

The function gets two arguments, the first of which must be a File object f. The second argument must be a positive integer. The function reads data up to len bytes. A string with the result is returned, if everything went well and fail otherwise. In the latter case, one can query the error with LastSystemError (Reference: LastSystemError).

Note that the reading is done via the buffer of f, such that this function will be quite fast also for large amounts of data.

If the file is already at the end of the file, the function returns a string of length 0. Note that this is not an error!

If a previous call to IO_HasData (4.2-5) or to IO_Select (4.3-3) indicated that there is data available to read, then it is guaranteed that the function IO_Read does not block and returns at least one byte if the file is not yet at end of file and an empty string otherwise.
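The calling pattern for IO_HasData and IO_Read can be sketched as below. A regular file in a temporary directory is used only to keep the example self-contained; for such a file data is normally always reported as available, so the pattern only becomes essential with pipes or sockets.

```gap
LoadPackage("io");

path := Filename(DirectoryTemporary(), "poll.txt");
f := IO_File(path, "w");
IO_Write(f, "some bytes\n");
IO_Close(f);

f := IO_File(path, "r");
data := "";
while IO_HasData(f) do
    chunk := IO_Read(f, 1024);    # does not block after IO_HasData gave true
    if chunk = fail then
        Error("read failed: ", LastSystemError().message);
    fi;
    if Length(chunk) = 0 then     # empty string: end of file
        break;
    fi;
    Append(data, chunk);
od;
IO_Close(f);
```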

4.2-7 IO_Write
‣ IO_Write( f[, things, ...] )( function )

Returns: an integer or fail

This function can get an arbitrary number of arguments, the first of which must be a File object f. All the other arguments are just written to f if they are strings. Otherwise, the String function is called on them and the result is written out to f.

Note that the writing is done buffered. That is, the data is first written to the buffer and only really written out after the buffer is full or after the user explicitly calls IO_Flush (4.2-10) on f.

The result is either the number of bytes written in case of success or fail in case of an error. In the latter case the error can be queried with LastSystemError (Reference: LastSystemError).

Note that this function blocks until all data is at least written into the buffer and might block until data can be sent again if the buffer is full.
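As a small illustration, the following sketch mixes strings and other GAP objects in one call. The non-string arguments are converted with String, and the returned byte count covers the whole output. The file name is hypothetical.

```gap
LoadPackage("io");

path := Filename(DirectoryTemporary(), "write.txt");
f := IO_File(path, "w");
n := IO_Write(f, "n = ", 42, ", list = ", [1, 2, 3], "\n");
IO_Close(f);                      # flushes the write buffer

f := IO_File(path, "r");
content := IO_ReadUntilEOF(f);
IO_Close(f);
Print(n, " bytes: ", content);
```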

4.2-8 IO_WriteLine
‣ IO_WriteLine( f, line )( function )

Returns: an integer or fail

Behaves like IO_Write (4.2-7) but works on a single string line and sends an (operating system dependent) end of line string afterwards. IO_Flush (4.2-10) is also called automatically after the operation, so one can be sure that the data is actually written out once the function has completed.

4.2-9 IO_WriteLines
‣ IO_WriteLines( f, list )( function )

Returns: an integer or fail

Behaves like IO_Write (4.2-7) but works on a list of strings list and sends an (operating system dependent) end of line string after each string in the list. IO_Flush (4.2-10) is also called automatically after the operation, so one can be sure that the data is actually written out once the function has completed.

4.2-10 IO_Flush
‣ IO_Flush( f )( function )

Returns: true or fail

This function gets one argument f, which must be a File object. It writes out all the data that is in the write buffer. This is not necessary before the call to the function IO_Close (4.2-16), since that function calls IO_Flush automatically. However, it is necessary to call IO_Flush after calls to IO_Write (4.2-7) to be sure that the data is really sent out. The function returns true if everything goes well and fail if an error occurs.

Remember that the functions IO_WriteLine (4.2-8) and IO_WriteLines (4.2-9) implicitly call IO_Flush after they are done.

Note that this function might block until all data is actually written to the file descriptor.
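The effect of the write buffer can be made visible by reading the file through a second File object before and after the flush. This sketch assumes that the default buffer is larger than the three bytes written; ReadWholeFile is a hypothetical helper defined only for this example.

```gap
LoadPackage("io");

# Hypothetical helper: return the current content of the file at path p.
ReadWholeFile := function(p)
    local r, s;
    r := IO_File(p, "r");
    s := IO_ReadUntilEOF(r);
    IO_Close(r);
    return s;
end;

path := Filename(DirectoryTemporary(), "flush.txt");
w := IO_File(path, "w");
IO_Write(w, "abc");               # stays in the write buffer for now

before := ReadWholeFile(path);    # nothing has reached the file yet
IO_Flush(w);
after := ReadWholeFile(path);     # now the data is in the file
IO_Close(w);
```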

4.2-11 IO_WriteFlush
‣ IO_WriteFlush( f[, things] )( function )

Returns: an integer or fail

This function behaves like IO_Write (4.2-7) followed by a call to IO_Flush (4.2-10). It returns either the number of bytes written or fail if an error occurs.

4.2-12 IO_ReadyForWrite
‣ IO_ReadyForWrite( f )( function )

Returns: true or false

This function takes one argument f which must be a File object. It returns true or false according to whether the file f is ready to write. A return value of true guarantees that the next call to IO_WriteNonBlocking (4.2-13) on that file will succeed without blocking and accept at least one byte.

4.2-13 IO_WriteNonBlocking
‣ IO_WriteNonBlocking( f, st, pos, len )( function )

Returns: an integer or fail

This function takes four arguments. The first one f must be a File object, the second st a string, and the arguments pos and len must be integers, such that positions pos+1 until pos+len are bound in st. The function tries to write up to len bytes from st from position pos+1 to the file f. If a previous call to IO_ReadyForWrite (4.2-12) or to IO_Select (4.3-3) indicates that f is writable, then it is guaranteed that the following call to IO_WriteNonBlocking will not block and accept at least one byte of data. Note that it is not guaranteed that all len bytes are written. The function returns the number of bytes written or fail if an error occurs.

4.2-14 IO_ReadyForFlush
‣ IO_ReadyForFlush( f )( function )

Returns: true or false

This function takes one argument f which must be a File object. It returns true or false according to whether the file f is ready to flush. A return value of true guarantees that the next call to IO_FlushNonBlocking (4.2-15) on that file will succeed without blocking and flush out at least one byte. Note that this does not guarantee that this call succeeds in flushing out the whole content of the buffer!

4.2-15 IO_FlushNonBlocking
‣ IO_FlushNonBlocking( f )( function )

Returns: true, false, or fail

This function takes one argument f which must be a File object. It tries to write all data in the writing buffer to the file descriptor. If this succeeds, the function returns true and false otherwise. If an error occurs, fail is returned. If a previous call to IO_ReadyForFlush (4.2-14) or IO_Select (4.3-3) indicated that f is flushable, then it is guaranteed that the following call to IO_FlushNonBlocking does not block. However, it is not guaranteed that true is returned from that call.
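The non-blocking write and flush functions are meant to be used in loops of the following shape. This is only a sketch on a regular file with a deliberately tiny write buffer; a real server would call IO_Select (4.3-3) on several channels instead of polling a single one.

```gap
LoadPackage("io");

path := Filename(DirectoryTemporary(), "nonblocking.txt");
f := IO_File(path, "w", 16);      # 16 byte write buffer
st := "0123456789abcdefghijklmnopqrstuvwxyz";

# Hand the string over piece by piece; pos counts the bytes accepted so far.
pos := 0;
while pos < Length(st) do
    if IO_ReadyForWrite(f) then
        n := IO_WriteNonBlocking(f, st, pos, Length(st) - pos);
        if n = fail then
            Error("write failed: ", LastSystemError().message);
        fi;
        pos := pos + n;
    fi;
od;

# Empty the write buffer; false means that not everything was flushed yet.
flushed := false;
while flushed <> true do
    if IO_ReadyForFlush(f) then
        flushed := IO_FlushNonBlocking(f);
        if flushed = fail then
            Error("flush failed: ", LastSystemError().message);
        fi;
    fi;
od;
IO_Close(f);
```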

4.2-16 IO_Close
‣ IO_Close( f )( function )

Returns: true or fail

This function closes the File object f: it first writes out all data in the write buffer and then closes the file descriptor. All buffers are freed. In case of an error, the function returns fail and otherwise true. Note that for pipes to other processes this function collects data about the terminated processes using IO_WaitPid (3.2-66).

4.3 Other functions

4.3-1 IO_GetFD
‣ IO_GetFD( f )( function )

Returns: an integer

This function returns the real file descriptor that is behind the File object f.

4.3-2 IO_GetWBuf
‣ IO_GetWBuf( f )( function )

Returns: a string or false

This function gets one argument f which must be a File object and returns the writing buffer of that File object. This is necessary for File objects that are not associated with a real file descriptor but just collect everything that was written in their writing buffer. Remember to use this function before closing the File object.
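A minimal sketch of collecting output in a string: the File object is created with IO_WrapFD (4.1-2) using -1 as file descriptor and a string as write buffer, and the buffer is fetched before the object is closed. The example assumes that the end of line string is "\n".

```gap
LoadPackage("io");

f := IO_WrapFD(-1, false, "");    # no descriptor, output is appended to a string
IO_Write(f, "x = ", 17, "\n");
IO_WriteLine(f, "done");
result := ShallowCopy(IO_GetWBuf(f));   # fetch the buffer before closing
IO_Close(f);
Print(result);
```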

4.3-3 IO_Select
‣ IO_Select( r, w, f, e, t1, t2 )( function )

Returns: an integer or fail

This function is the corresponding function to IO_select (3.2-55) for buffered file access. It behaves similarly to that function. The differences are the following: There are four lists of files r, w, f, and e. They all can contain either integers (standing for file descriptors) or File objects. The list r is for checking, whether files or file descriptors are ready to read, the list w is for checking whether they are ready to write, the list f is for checking whether they are ready to flush, and the list e is for checking whether they have exceptions.

For File objects it is always first checked whether there is either data available in a reading buffer or space in a writing buffer. If so, they are immediately reported to be ready (this feature makes the list of File objects to test for flushability necessary). For the remaining files and for all specified file descriptors, the function IO_select (3.2-55) is called to get an overview about readiness. The timeout values t1 and t2 are set to zero for immediate returning if one of the requested buffers was ready.

IO_Select returns the number of files or file descriptors that are ready to serve or fail if an error occurs.
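The following sketch waits for output of a child process started with IO_Popen (4.4-3). It assumes that an executable echo can be found via PATH and that t1 and t2 are seconds and microseconds as for IO_select (3.2-55).

```gap
LoadPackage("io");

p := IO_Popen("echo", ["hello from the child"], "r");
if p = fail then
    Error("cannot start echo: ", LastSystemError().message);
fi;

# Wait at most 5 seconds until p becomes readable.
nr := IO_Select([p], [], [], [], 5, 0);
if nr = fail then
    Error("select failed: ", LastSystemError().message);
elif nr = 0 then
    Print("timeout, nothing to read\n");
else
    out := IO_ReadUntilEOF(p);
    Print(out);
fi;
IO_Close(p);
```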

The following function is a convenience function for directory access:

4.3-4 IO_ListDir
‣ IO_ListDir( pathname )( function )

Returns: a list of strings or fail

This function gets a string containing a path name as single argument and returns a list of strings that are the names of the files in that directory, or fail, if an error occurred.
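For illustration, this sketch creates an empty file in a temporary directory and checks that its name shows up in the listing. The file name is hypothetical, and nothing is assumed about other entries in the returned list.

```gap
LoadPackage("io");

dir := DirectoryTemporary();
f := IO_File(Filename(dir, "listed.txt"), "w");
IO_Close(f);

names := IO_ListDir(Filename(dir, ""));
if names = fail then
    Error("cannot list directory: ", LastSystemError().message);
fi;
Print("listed.txt" in names, "\n");
```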

4.3-5 ChangeDirectoryCurrent
‣ ChangeDirectoryCurrent( pathname )( function )

Returns: true on success and fail on failure

Changes the current directory. Returns true on success and fail on failure.

The following function is used to create strings describing a pair of an IP address and a port number in a binary way. These strings can be used in connection with the C library functions connect, bind, recvfrom, and sendto for the arguments needing such address pairs.

4.3-6 IO_MakeIPAddressPort
‣ IO_MakeIPAddressPort( ipstring, portnr )( function )

Returns: a string

This function gets a string ipstring containing an IP address in dot notation, i.e. four numbers in the range from 0 to 255 separated by dots ".", and an integer portnr, which is a port number. The result is a string of the correct length to be used for the low level C library functions, wherever IP address port number pairs are needed. The string ipstring can also be a host name, which is then looked up using IO_gethostbyname (3.2-23) to find the IP address.

4.3-7 IO_Environment
‣ IO_Environment( )( function )

Returns: a record or fail

Takes no arguments, uses IO_environ (3.3-2) to get the environment and returns a record in which the component names are the names of the environment variables and the values are the values. This can then be changed and the changed record can be given to IO_MakeEnvList (4.3-8) to produce again a list which can be used for IO_execve (3.2-13) as third argument.

4.3-8 IO_MakeEnvList
‣ IO_MakeEnvList( r )( function )

Returns: a list of strings

Takes a record as returned by IO_Environment (4.3-7) and turns it into a list of strings as needed by IO_execve (3.2-13) as third argument.
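The round trip through these two functions can be sketched as follows. The variable name IO_DEMO_VARIABLE is hypothetical, and the resulting list is expected to contain it in the NAME=value form used by execve.

```gap
LoadPackage("io");

env := IO_Environment();
env.IO_DEMO_VARIABLE := "42";     # add or override one variable
envlist := IO_MakeEnvList(env);   # suitable as third argument of IO_execve
Print("IO_DEMO_VARIABLE=42" in envlist, "\n");
```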

4.4 Inter process communication

4.4-1 IO_FindExecutable
‣ IO_FindExecutable( path )( function )

Returns: fail or the path to an executable

If the path name path contains a slash, this function simply checks whether the string path refers to an executable file. If so, path is returned as is. Otherwise, fail is returned. If the path name path does not contain a slash, all directories in the environment variable PATH are searched for an executable with name path. If one is found, the full path to that executable is returned, otherwise fail.

This function is used whenever one of the following functions gets an argument that should refer to an executable.

4.4-2 IO_CloseAllFDs
‣ IO_CloseAllFDs( exceptions )( function )

Returns: nothing

Closes all file descriptors except those listed in exceptions, which must be a list of integers.

4.4-3 IO_Popen
‣ IO_Popen( path, argv, mode )( function )

Returns: a File object or fail

The argument path must refer to an executable file in the sense of IO_FindExecutable (4.4-1).

Starts a child process using the executable in path with either stdout or stdin being a pipe. The argument mode must be either the string "r" or the string "w".

In the first case, the standard output of the child process will be the writing end of a pipe. A File object for reading connected to the reading end of the pipe is returned. The standard input and standard error of the child process will be the same as in the calling GAP process.

In the second case, the standard input of the child process will be the reading end of a pipe. A File object for writing connected to the writing end of the pipe is returned. The standard output and standard error of the child process will be the same as in the calling GAP process.

In case of an error, fail is returned.

The process will usually die, when the pipe is closed, but can also do so without that. The File object remembers the process ID of the started process and the IO_Close (4.2-16) function then calls IO_WaitPid (3.2-66) for it to acquire information about the terminated process.

Note that IO_Popen activates our SIGCHLD handler (see IO_InstallSIGCHLDHandler (3.3-3)).

In either case the File object will have the attribute "ProcessID" set to the process ID of the child process.
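A minimal sketch of the "r" mode, assuming that a shell sh can be found via PATH. Its standard output is read line by line in the calling GAP process, and IO_Close collects the terminated child.

```gap
LoadPackage("io");

exe := IO_FindExecutable("sh");
if exe = fail then
    Error("no sh executable found in PATH");
fi;

p := IO_Popen(exe, ["-c", "echo one; echo two"], "r");
if p = fail then
    Error("cannot start child: ", LastSystemError().message);
fi;
lines := IO_ReadLines(p);         # reads until the child closes its output
IO_Close(p);
Print(lines, "\n");
```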

4.4-4 IO_Popen2
‣ IO_Popen2( path, argv )( function )

Returns: a record or fail

The argument path must refer to an executable file in the sense of IO_FindExecutable (4.4-1).

A new child process is started using the executable in path. The standard input and standard output of it are pipes. The writing end of the input pipe and the reading end of the output pipe are returned as File objects bound to two components "stdin" and "stdout" (resp.) of the returned record. This means, you have to write to "stdin" and read from "stdout" in the calling GAP process. The standard error of the child process will be the same as the one of the calling GAP process.

Returns fail if an error occurred.

The process will usually die, when one of the pipes is closed. The File objects remember the process ID of the called process and the function call to IO_Close (4.2-16) for the stdout object will call IO_WaitPid (3.2-66) for it to acquire information about the terminated process.

Note that IO_Popen2 activates our SIGCHLD handler (see IO_InstallSIGCHLDHandler (3.3-3)).

Both File objects will have the attribute "ProcessID" set to the process ID of the child process, which will also be bound to the "pid" component of the returned record.

4.4-5 IO_Popen3
‣ IO_Popen3( path, argv )( function )

Returns: a record or fail

The argument path must refer to an executable file in the sense of IO_FindExecutable (4.4-1).

A new child process is started using the executable in path. The standard input, standard output, and standard error of it are pipes. The writing end of the input pipe, the reading end of the output pipe and the reading end of the error pipe are returned as File objects bound to three components "stdin", "stdout", and "stderr" (resp.) of the returned record. This means, you have to write to "stdin" and read from "stdout" and "stderr" in the calling GAP process.

Returns fail if an error occurred.

The process will usually die, when one of the pipes is closed. All three File objects will remember the process ID of the newly created process and the call to the IO_Close (4.2-16) function for the stdout object will call IO_WaitPid (3.2-66) for it to acquire information about the terminated child process.

Note that IO_Popen3 activates our SIGCHLD handler (see IO_InstallSIGCHLDHandler (3.3-3)).

All three File objects will have the attribute "ProcessID" set to the process ID of the child process, which will also be bound to the "pid" component of the returned record.

4.4-6 IO_StartPipeline
‣ IO_StartPipeline( progs, infd, outfd, switcherror )( function )

Returns: a record or fail

The argument progs is a list of pairs, the first entry of each pair being a path to an executable (in the sense of IO_FindExecutable (4.4-1)) and the second an argument list. The argument infd is an open file descriptor for reading and outfd is an open file descriptor for writing; either can be replaced by the string "open", in which case a new pipe will be opened. The argument switcherror is a boolean indicating whether standard error channels are also switched to the corresponding output channels.

This function starts up all processes and connects them with pipes. The input of the first is switched to infd and the output of the last to outfd.

Returns a record with the following components: pids is a list of process ids if everything worked. For each process for which some error occurred the corresponding pid is replaced by fail. The stdin component is equal to false, or to the file descriptor of the writing end of the newly created pipe which is connected to the standard input of the first of the new processes if infd was "open". The stdout component is equal to false or to the file descriptor of the reading end of the newly created pipe which is connected to the standard output of the last of the new processes if outfd was "open".

Note that the SIGCHLD handler of the IO package is installed by this function (see IO_InstallSIGCHLDHandler (3.3-3)) and that it lies in the responsibility of the caller to use IO_WaitPid (3.2-66) to ask for the status information of all child processes after their termination, or call IO_IgnorePid (3.2-67) to ignore the return value of a process.

4.4-7 IO_StringFilterFile
‣ IO_StringFilterFile( progs, filename )( function )

Returns: a string or fail

Reads the file with the name filename, however, a pipeline is created by the processes described by progs (see IO_StartPipeline (4.4-6)) to filter the content of the file through the pipeline. The result is put into a GAP string and returned. If something goes wrong, fail is returned.

4.4-8 IO_FileFilterString
‣ IO_FileFilterString( filename, progs, st[, append] )( function )

Returns: a string or fail

Writes the content of the string st to the file with the name filename, however, a pipeline is created by the processes described by progs (see IO_StartPipeline (4.4-6)) to filter the content of the string through the pipeline. The result is put into the file. If the boolean value append is given and equal to true, then the data will be appended to the already existing file. If something goes wrong, fail is returned.
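The two filter functions can be combined into a round trip. The sketch below assumes that gzip is installed and found via PATH; the options -9c and -dc are ordinary gzip options and the file name is hypothetical.

```gap
LoadPackage("io");

path := Filename(DirectoryTemporary(), "demo.gz");
text := "some text to be compressed\n";

# Write text through "gzip -9c" into the file ...
res := IO_FileFilterString(path, [["gzip", ["-9c"]]], text);
if res = fail then
    Error("writing through the pipeline failed");
fi;

# ... and read it back through "gzip -dc".
back := IO_StringFilterFile([["gzip", ["-dc"]]], path);
Print(back = text, "\n");
```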

4.4-9 IO_FilteredFile
‣ IO_FilteredFile( progs, filename[, mode][, bufsize] )( function )

Returns: a File object or fail

This function is similar to IO_File (4.1-3) and behaves nearly identically. The only difference is that a filtering pipeline is inserted between the file and the File object, such that everything read or written is filtered through this pipeline of processes.

The File object remembers the started processes and upon the final call to IO_Close (4.2-16) automatically uses the IO_WaitPid (3.2-66) function to acquire information from the terminated processes in the pipeline after their termination. This means that you do not have to call IO_WaitPid (3.2-66) any more after the call to IO_Close (4.2-16).

Note that IO_FilteredFile activates our SIGCHLD handler (see IO_InstallSIGCHLDHandler (3.3-3)).

The File object will have the attribute "ProcessID" set to the list of process IDs of the child processes.

4.4-10 IO_CompressedFile
‣ IO_CompressedFile( filename[, mode][, bufsize] )( function )

Returns: a File object or fail

This function is a convenience wrapper around IO_FilteredFile (4.4-9) which handles a number of common compressed file formats transparently, by calling an external program. The arguments to this function are identical to IO_File (4.1-3). If the extension to filename is one of gz, bz2 or xz, then the file is transparently compressed/uncompressed using gzip, bzip2 or xz respectively. If the extension is none of these, then the command behaves identically to IO_File (4.1-3).

Note that as this function calls IO_FilteredFile (4.4-9), it will activate our SIGCHLD handler (see IO_InstallSIGCHLDHandler (3.3-3)).

When compression / decompression is active, the File object will have the attribute "ProcessID" set to the list of process IDs of the child processes.
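As a sketch, the following writes and rereads a file whose name ends in gz, so that gzip (assumed to be installed and found via PATH) is used in both directions. The file name is hypothetical.

```gap
LoadPackage("io");

path := Filename(DirectoryTemporary(), "demo.txt.gz");

f := IO_CompressedFile(path, "w");
if f = fail then
    Error("cannot open ", path, " for writing");
fi;
IO_Write(f, "compressed line\n");
IO_Close(f);                      # also waits for the compressing process

f := IO_CompressedFile(path, "r");
back := IO_ReadUntilEOF(f);
IO_Close(f);
Print(back);
```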

4.4-11 IO_SendStringBackground
‣ IO_SendStringBackground( f, st )( function )

This function uses IO_Write (4.2-7) to write the whole string st to the File object f. However, this is done by forking off a child process identical to the calling GAP process that does the sending. The calling GAP process returns immediately with the result true, even before anything has been sent. The forked off sender process terminates itself immediately after it has sent all data.

The reason for having this function available is the following: If one uses IO_Popen2 (4.4-4) or IO_Popen3 (4.4-5) to start up a child process with standard input and standard output being a pipe, then one usually has the problem that the child process starts reading some data, but then wants to write data before it has received all incoming data. If the calling GAP process first tried to write all data and only started to read the output of the child process after sending all data, a deadlock would occur. This is avoided with the forking and backgrounding approach.

Remember to close the writing end of the standard input pipe in the calling GAP process directly after IO_SendStringBackground has returned, because otherwise the child process might not notice that all data has arrived, because the pipe persists! See the file popen2.g in the example directory for an example.

Note that with most modern operating systems the forking off of an identical child process does in fact not mean a duplication of the total main memory used by both processes, because the operating system kernel will use "copy on write". However, if a garbage collection happens to become necessary during the sending of the data in the forked off sending process, this might trigger doubled memory usage.
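The order of the calls can be sketched with cat as the child process, which is assumed to be found via PATH. The input string is made long enough that writing it directly could fill the pipe.

```gap
LoadPackage("io");

# Build an input of a few hundred kilobytes.
input := "";
for i in [1 .. 50000] do
    Append(input, String(i));
    Append(input, "\n");
od;

p := IO_Popen2("cat", []);
if p = fail then
    Error("cannot start cat: ", LastSystemError().message);
fi;
IO_SendStringBackground(p.stdin, input);
IO_Close(p.stdin);                # otherwise cat never sees end of file
output := IO_ReadUntilEOF(p.stdout);
IO_Close(p.stdout);
Print(Length(output), " bytes came back\n");
```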

4.4-12 IO_PipeThrough
‣ IO_PipeThrough( cmd, args, input )( function )

Returns: a string or fail

Starts the process with the executable given by the file name cmd (in the sense of IO_FindExecutable (4.4-1)) with arguments in the argument list args (a list of strings). The standard input and output of the started process are connected via pipes to the calling process. The content of the string input is written to the standard input of the called process and its standard output is read and returned as a string.

All the necessary I/O multiplexing and non-blocking I/O to avoid deadlocks is done in this function.

This function properly does IO_WaitPid (3.2-66) to wait for the termination of the child process but does not restore the original GAP SIGCHLD signal handler (see IO_InstallSIGCHLDHandler (3.3-3)).
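A one-line use, assuming that the standard Unix tool tr is found via PATH:

```gap
LoadPackage("io");

# Send a string through "tr a-z A-Z" and collect its standard output.
upper := IO_PipeThrough("tr", ["a-z", "A-Z"], "hello world\n");
if upper = fail then
    Error("pipe failed: ", LastSystemError().message);
fi;
Print(upper);
```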

4.4-13 IO_PipeThroughWithError
‣ IO_PipeThroughWithError( cmd, args, input )( function )

Returns: a record or fail

Starts the process with the executable given by the file name cmd (in the sense of IO_FindExecutable (4.4-1)) with arguments in the argument list args (a list of strings). The standard input, output and error of the started process are connected via pipes to the calling process. The content of the string input is written to the standard input of the called process and its standard output and error are read and returned as a record with components out and err, which are strings.

All the necessary I/O multiplexing and non-blocking I/O to avoid deadlocks is done in this function.

This function properly does IO_WaitPid (3.2-66) to wait for the termination of the child process but does not restore the original GAP SIGCHLD signal handler (see IO_InstallSIGCHLDHandler (3.3-3)).

The function returns either fail if an error occurred, or otherwise a record with components out and err which are bound to strings containing the full standard output and standard error of the called process, and status which is the status returned from the exiting process.
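A small sketch, assuming that a shell sh is found via PATH. The child copies its input to standard output and writes one extra line to standard error; only the out and err components are inspected here.

```gap
LoadPackage("io");

r := IO_PipeThroughWithError("sh", ["-c", "cat; echo oops 1>&2"], "data\n");
if r = fail then
    Error("pipe failed: ", LastSystemError().message);
fi;
Print("stdout: ", r.out);
Print("stderr: ", r.err);
```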


generated by GAPDoc2HTML