Terminal Concepts in GNU/Linux

Copyright © 2003 Charles M. "Chip" Coldwell.

Introduction

Although it is rarely done today, one can connect a video terminal with a null modem cable to a serial port on a Linux box and get an interactive shell with very little effort. All that is required is that you add a line such as

dt:12345:respawn:/sbin/agetty -L ttyS0 9600 vt510

to /etc/inittab, and then run telinit q as root to get init to re-read the /etc/inittab file, and you will be greeted by a login prompt on the terminal screen.

The ease with which we can do this task that we almost never want to do is due to the long legacy of Unix that Linux inherited. When Unix was young, video terminals were the user interface of choice and became a very important concept in the design of the operating system. Today, although "glass teletypes" are becoming more and more rare, the terminal has remained at the heart of the operating system as a very powerful metaphor.

Most computer users sophisticated (or perhaps old) enough to know what a character cell terminal is probably think of them as relics of the past before the GUI revolutionized the user interface. Nonetheless, the first thing almost all Unix users do immediately after logging in to their elaborate desktops (and watching the dazzling eye candy meant to distract them from the intolerably long wait until the desktop system is ready to do something useful) is to throw up a terminal window to get an interactive shell into which they can type commands. Many probably think of it as just an interactive shell and not a terminal window, but in fact in Unix you cannot have an interactive shell without a controlling terminal.

The controlling terminal is one of the properties listed by the ps command, in the column headed TTY below:

$ ps
  PID TTY          TIME CMD
26841 pts/1    00:00:00 bash
26897 pts/1    00:00:00 ps

pts/1 is actually an abbreviation for /dev/pts/1, a character special file for the interactive shell's (PID 26841) controlling terminal:

$ ls -l /dev/pts/1
crw--w----    1 user tty      136,   1 Feb 21 00:35 /dev/pts/1

This is a bit of a peculiar thing, because normally character special files correspond to hardware devices such as serial ports. However, there is no hardware corresponding to /dev/pts/1; it is a "pseudo-terminal". When a terminal window is opened, a new pseudo-terminal will appear in /dev/pts that will disappear again once the window is closed. Pseudo-terminals are discussed in detail below, but for now they can be thought of as virtual serial ports that allow the interactive shell and the operating system to use the familiar user interface paradigm of a user sitting in front of a character-cell terminal connected to the computer by a serial port, even in situations that bear little resemblance to the one described. For the purposes of this book, then, a "terminal" will be any device that connects to a computer through an asynchronous serial port (real or virtual) and communicates with it using the ASCII character set.1

Asynchronous Serial Communications

From the operating system's point of view, the terminal is the serial port and its associated device driver. Therefore, in this chapter the operation of the hardware and the implementation of the device driver are examined in some detail starting from the voltages on the wires and moving up to a discussion of the protocols for grouping serial bits into characters.

Timing and voltages

The serial part of "asynchronous serial communications" is easy to explain: it refers to the fact that the data are represented by voltages driven serially on one physical wire. This is as opposed to a parallel interface, which would transmit bits on more than one physical wire simultaneously. To say that the voltages are driven serially simply means that to transmit a character the transmitter must assert the voltages corresponding to each bit of the character sequentially in time. It is up to the receiver to reassemble them into the original character.

When a 1 bit is driven onto the wire the corresponding voltage is referred to as a "mark", and the voltage corresponding to a 0 bit is a "space". The RS-232C standard specifies the following voltages

            Mark            Space
Transmit    -5V to -15V     +5V to +15V
Receive     -3V to -25V     +3V to +25V

by which it is meant, for example, that a standard-compliant line driver must drive a voltage between -5 and -15 volts to represent a mark, and a standard-compliant line receiver must recognize a voltage between -3 and -25 volts to represent a mark, etc.

If the transmitter and receiver were synchronized, then the receiver would know which bit is which simply by sampling the voltage on the wire at the times when bits are expected. This is, in fact, a fairly common way of doing serial communications, and synchronous serial ports are often used in telecommunications applications (for example, T1 lines). The important difference in asynchronous serial communications is that transmitter and receiver are only synchronized during a character, and the receiver cannot determine when the transmitter will start the next character based on when the last one was sent.

Bits and Characters

When a transmitter is idle, it holds a mark on its output. That is, when a character is not being transmitted, the transmit pin is at a voltage of between -5 and -15 volts. Since the receiver is not synchronized with the transmitter, it does not know in advance when transmission of the next character will start. Therefore, asynchronous transmission of a character always begins with a "start bit", which the receiver uses as a reference point for timing the bits that follow. Since the idle channel is held at the voltage corresponding to a mark, the start bit is always a space so that it can be distinguished.

The bits that follow the start bit are the data bits. In order to reconstruct the character, the receiver must know the speed at which the transmitter is sending data bits and the number of data bits per character (not always 8!). For example, if the receiver sees the middle of the start bit at time t, and the bit rate is r bits per second with n bits per character, then it can sample the voltage on the wire at t + i/r for i = 1, 2, 3 ... n to determine the corresponding bits of the character.

Since there are a number of ways that things could go wrong with such a simple scheme as the one just described, two additional measures are provided to make communications more robust: parity and stop bits.

Parity means that an additional bit is transmitted after the last data bit of every character, with its value chosen so that the total number of ones transmitted always has the same parity, either even or odd. For example, the ASCII character 'A' is represented by the seven-bit binary number 1000001 which contains two ones. If even parity is chosen, then transmitted 'A' characters will be followed by a zero parity bit. If odd parity is chosen then transmitted 'A' characters will be followed by a one parity bit.2 Note that parity adds a one bit overhead to the communications channel which can be avoided by choosing to use no parity at all.
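As a rough illustration of the parity rule, the following C sketch computes the parity bit for a character. The function name parity_bit and its parameters are hypothetical, chosen only for this example; on real hardware the UART generates the bit itself.

```c
/* Return the parity bit (0 or 1) for the low `data_bits` bits of `c`.
 * If `odd` is zero, even parity is used; otherwise odd parity.
 * Hypothetical helper for illustration only. */
int parity_bit(unsigned char c, int data_bits, int odd)
{
  int ones = 0;
  int i;

  /* Count the ones among the data bits. */
  for(i = 0; i < data_bits; i++)
    ones += (c >> i) & 1;

  /* For even parity the bit is 1 only if the count is odd, so that
   * the total becomes even; odd parity is simply the opposite. */
  return (ones & 1) ^ (odd ? 1 : 0);
}
```

With this sketch, parity_bit('A', 7, 0) gives 0 and parity_bit('A', 7, 1) gives 1, in agreement with the 'A' example in the text.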

Stop bits are more subtle. As was described above, an idle channel is held in a marking condition and therefore the start bit is always a space. The stop bits, on the other hand, are always marks and therefore are indistinguishable from an idle channel. The standard allows for one, one and a half or two stop bits. What this amounts to is a choice of the minimum inter-character spacing, with one bit-time being the minimum and two bit-times the maximum. In the past, two stop bits were required by hardcopy terminals whose mechanisms would otherwise not be able to keep up with the data stream, but it is rare to use more than one stop bit today.

Setting      Possible values
Data rate    many
Data bits    5, 6, 7, 8
Stop bits    1, 1.5, 2
Parity       even, odd, none
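To see what these settings cost in throughput, the following sketch adds up the bit times in one character frame (start bit, data bits, optional parity bit, stop bits) and divides the data rate by that total. The function names are hypothetical and the arithmetic assumes characters are sent back to back with no idle time between them.

```c
/* Number of bit times needed to send one character: one start bit,
 * the data bits, an optional parity bit, and the stop bits
 * (1, 1.5 or 2). Hypothetical helper for illustration. */
double frame_bit_times(int data_bits, int has_parity, double stop_bits)
{
  return 1.0 + data_bits + (has_parity ? 1.0 : 0.0) + stop_bits;
}

/* Upper bound on characters per second at a given data rate,
 * assuming characters are transmitted back to back. */
double chars_per_second(double bit_rate, int data_bits,
                        int has_parity, double stop_bits)
{
  return bit_rate / frame_bit_times(data_bits, has_parity, stop_bits);
}
```

For example, at 9600 bits per second with 8 data bits, no parity and one stop bit, each character takes 10 bit times, so at most 960 characters can be sent per second.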

Bit errors

Things often go wrong on an asynchronous serial communications channel. However, it may be possible for the receiver to determine that an error has occurred in many situations. Three types of errors that are commonly detected are overrun, framing and parity.

An overrun means that the receiver's FIFO was full when a new character was received, forcing it to clobber existing data in order to store it. This is not a failure of the communications channel per se, but means that the computer isn't keeping up with the rate of data being received.

A framing error means that the receiver recognized a start bit, but did not find a valid stop bit when it was expected. This typically means that the intra-character timing was not synchronized between transmitter and receiver, possibly meaning that either the data rate or data bits settings do not match on transmitter and receiver.

A parity error simply means that the number of ones in a received character plus the parity bit did not match the setting specified.

Break

There is an "out of band" signal that can be transmitted by a serial port, namely, a break. A break is sent by holding the transmit wire in a spacing condition for a duration of at least two character times plus three bit times (as specified by the CCITT "blue book"). In other words, the transmitter holds the voltage corresponding to a zero bit for a length of time that is definitely longer than the transmission time for a single character. The receiver, therefore, can recognize a break since it receives more than a character's worth of spaces with no stop bit (which is always a mark, see Bits and Characters) when one would be expected.
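The minimum break duration quoted above can be worked out for a particular line setting. The sketch below is only an illustration: the function name is hypothetical, and it assumes that a "character time" means the whole frame including start, parity and stop bits.

```c
/* Minimum break duration in microseconds, using the rule of two
 * character times plus three bit times.
 * Assumption: `frame_bits` is the full frame length in bit times
 * (start + data + parity + stop). */
double min_break_usec(double bit_rate, double frame_bits)
{
  double bit_times = 2.0 * frame_bits + 3.0;

  return bit_times * 1e6 / bit_rate;
}
```

At 9600 bits per second with a 10-bit frame this comes to 23 bit times, or roughly 2.4 milliseconds. A program that needs to send a break would normally call the POSIX function tcsendbreak() declared in <termios.h> rather than time the signal by hand.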

Normally, recognizing breaks is the responsibility of the UART (see The UART), but if for some reason (such as a limited UART that does not implement this functionality) the UART fails to do so, reception of a break will manifest itself as a large number of framing errors (see Bit Errors).

Breaks are used to signal an interrupt to the receiving process (see Terminal Signals).

Modem Control Signals

The serial communications scheme described so far only requires three wires to connect the transmitter and receiver: at either end there must be one wire for transmitting, one wire for receiving and one to carry a ground reference for the transmit and receive voltages. In practice, serial communications are often done with just these three wires. However, the standard provides for a number of ancillary signals for supporting the most common serial communications application: serial communications mediated by modems over the public telephone network. These are known collectively as the modem control signals.

The situation envisioned by the standard is that of a terminal or a computer (called "Data Terminal Equipment" or DTE) connected to a modem (called "Data Communications Equipment" or DCE) by an asynchronous serial channel. The modem is connected to the telephone network, and should be able to originate and/or receive telephone calls to/from other modems. The modem control signals are listed in the table below, and typically external modems have LEDs that display the condition of most of these signals at any time.

Acronym   Full name              Comes From
DTR       Data Terminal Ready    DTE
DSR       Data Set Ready         DCE
RTS       Request To Send        DTE
CTS       Clear To Send          DCE
DCD       Data Carrier Detect    DCE
RI        Ring Indicator         DCE

So the normal course of events when originating a call is

  1. DSR and CTS come on when the modem is powered up.
  2. DTR and RTS come on when the serial port is opened.
  3. DCD comes on when the number has been dialed and the remote modem has answered the call.

and when receiving a call

  1. DSR and CTS come on when the modem is powered up.
  2. DTR and RTS come on when the serial port is opened.
  3. RI comes on when the local telephone rings.
  4. DCD comes on when the local modem has answered the call.

Like the transmitted and received data signals, the modem control signals are held at -5 to -15 volts when idle, and +5 to +15 volts when asserted. These are the same voltages used on the data lines (TXD and RXD) to represent mark and space, respectively (see Timing and Voltages). Therefore a binary zero and a control signal being asserted are both represented by the same voltage; in this sense the control signals can be said to use "complementary" or "negative" logic.
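On Linux, a program can inspect the modem control signals of an open serial port. POSIX <termios.h> offers no function for this, so the sketch below uses the Linux TIOCMGET ioctl and the TIOCM_* bit names from <sys/ioctl.h>. The helper names describe_modem_lines and print_modem_lines are hypothetical.

```c
#include <stdio.h>
#include <sys/ioctl.h>

/* Write the names of the asserted modem control signals in `status`
 * (a bit mask as returned by the TIOCMGET ioctl) into `buf`. */
void describe_modem_lines(int status, char *buf, size_t len)
{
  snprintf(buf, len, "%s%s%s%s%s%s",
           (status & TIOCM_DTR) ? "DTR " : "",
           (status & TIOCM_DSR) ? "DSR " : "",
           (status & TIOCM_RTS) ? "RTS " : "",
           (status & TIOCM_CTS) ? "CTS " : "",
           (status & TIOCM_CAR) ? "DCD " : "",
           (status & TIOCM_RNG) ? "RI " : "");
}

/* Query and print the modem control lines of an open serial port.
 * Returns 0 on success, -1 if the ioctl fails (for example because
 * `fd` does not refer to a serial port). */
int print_modem_lines(int fd)
{
  int status;
  char buf[32];

  if(ioctl(fd, TIOCMGET, &status) < 0)
    return -1;
  describe_modem_lines(status, buf, sizeof buf);
  printf("asserted: %s\n", buf);
  return 0;
}
```

Called on a port with a powered-up modem attached and no call in progress, one would expect to see DTR, DSR, RTS and CTS listed but not DCD, matching the call sequences described above.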

Pin Assignments

The inclusion of the modem control signals causes the number of wires to proliferate beyond the minimal three-wire configuration discussed above. In fact, most serial ports have either nine or 25 pins, with the signals assigned to pins as shown in the following table.

Signal Name            Acronym   25-pin   9-pin
Protective ground                1        N/A
Transmitted data       TXD       2        3
Received data          RXD       3        2
Request To Send        RTS       4        7
Clear To Send          CTS       5        8
Data Set Ready         DSR       6        6
Signal ground          SG        7        5
Data Carrier Detect    DCD       8        1
Data Terminal Ready    DTR       20       4
Ring Indicator         RI        22       9

Note that the signal names are chosen from the perspective of a terminal (DTE); for example, a modem will transmit data to the terminal on RXD and receive data from the terminal on TXD.

Furthermore, if two DTEs are to be directly connected to each other without modems or telephones (such as the terminal and the Linux box in the introduction), the cable that connects them must interchange signals since, for example, both will want to transmit on pin 2 of a 25-pin connector and receive on pin 3. The cable must connect pin 2 of one connector to pin 3 of the other and vice versa; such a cable is called a "null modem" cable.

A null modem cable should also rearrange the modem control signals appropriately: RTS at one end should connect to CTS at the other, and DTR at one end should connect to DSR at the other. The DCD signal is usually connected to DSR at each end (and therefore to DTR at the opposite end).

TXD--\/--TXD
RXD--/\--RXD

RTS--\/--RTS
CTS--/\--CTS

DSR--+---DTR
DCD--|

DTR---+--DSR
      |--DCD

The UART

The hardware in a serial port that handles serial communications is called a UART, an acronym which stands for "Universal Asynchronous Receiver Transmitter". The UART is the device that the serial device driver drives. It is responsible for transmitting and receiving data as well as implementing all of the ancillary settings including bit rate, bits per character, parity, stop bits and modem control signals.

The UART does not directly interface with the pins on the serial port. It is usually designed to operate at much lower voltages (3.3 or 5 volts) and is interfaced to the pins by line drivers and receivers (such as the DS1488 and DS1489) that convert between the RS-232C voltages and the UART voltage.

Important examples of UARTs include the Intel 8251, a very early UART that was used in Digital Equipment Corporation's VT100 terminals, and National Semiconductor's 16550A which is probably the most popular UART on the market today.

Operating System Support

So far, the discussion has not touched at all on the software needed to support an asynchronous serial port. In this chapter, the bottom up approach continues with a detailed look at the operating system components that provide user level processes access to the hardware. Two characters play prominent roles in this part of the story: the serial device driver and the GNU C Library.

Device Drivers

The serial device driver is a component of the operating system kernel (in Linux, it is frequently implemented as a loadable module), and implements the hardware-specific aspects of the software needed to use the serial port. For example, there are a variety of multiport serial cards available on the market, and no two will have the same register map unless they share a chipset. Therefore each will have a different device driver, all of which present the same interface to the operating system but perform different operations on the hardware.

A large part of any device driver is usually devoted to implementing "methods" that do the real work of system calls. System calls are the way the kernel provides services to user space programs in a safe and controlled way. When a system call is executed by a user space program, it invokes a "trap" causing the CPU to context switch into the privileged kernel mode and thereby gain access to registers, I/O ports and protected memory regions that are normally off-limits to user space programs. The kernel then passes (and possibly rearranges) the system call parameters to the device driver method that was registered for that call when the driver was loaded.3

This operation is fundamentally different from the execution of a library function. Execution of a library function does not imply a context switch nor a change in CPU privilege; it is in almost every way the same as executing a normal call to a function.4 However, most of the interesting functions in the GNU C Library are implemented by executing system calls.

Special Files

The Unix approach is to make an analogy between a hardware device such as a serial port and a file. A file is something that you can open, read, write and close with some expectation that the next time you open and read it you will find the last thing that was written there. Obviously this expectation cannot be extended to serial ports, but it still seems reasonable to read and write to them. Data read by a process from a serial port is what was received, and data written by a process to a serial port is transmitted.

Opening and closing a serial port is a bit more problematic. Since a serial port isn't really a file, it may seem a bit counterintuitive to open an entry in the filesystem in order to gain access to it. However, this is precisely what is done for hardware devices, except that the corresponding "special files" are conventionally found in the directory /dev.

Special files are created by the mknod(1) command and come in two flavors: block and character. Block devices can only be read from or written to one block (for example, 1024 bytes) at a time and are usually associated with disk drives, whereas character devices operate one character at a time. Since an asynchronous serial port always transmits or receives one character at a time, serial ports are always character devices.

Special files also have major and minor numbers. The major number identifies to the operating system which device driver will service system calls on behalf of the special file,5 and the minor number identifies to the device driver one of the possibly several hardware devices that it manages on behalf of which it must act. For example, a multiport serial card would be managed by a single device driver but provides several serial ports. Each serial port would have a corresponding special file with the same major number but a unique minor number. The minor number enables the device driver to distinguish which serial port has been opened, read from, written to, etc.
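The major and minor numbers of a special file can also be read from a program. The following sketch uses stat(2) together with the major() and minor() macros that the GNU C Library provides in <sys/sysmacros.h>; the function name char_device_numbers is hypothetical.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/sysmacros.h>

/* Look up the major and minor numbers of the character special file
 * at `path`. Returns 0 on success, or -1 if stat(2) fails or the
 * path is not a character device. */
int char_device_numbers(const char *path,
                        unsigned int *maj, unsigned int *min)
{
  struct stat st;

  if(stat(path, &st) < 0)
    return -1;
  if(!S_ISCHR(st.st_mode))
    return -1;

  /* st_rdev holds the device numbers of the special file itself. */
  *maj = major(st.st_rdev);
  *min = minor(st.st_rdev);
  return 0;
}
```

Applied to the pseudo-terminal /dev/pts/1 from the introduction, this would report major number 136 and minor number 1, the same pair that ls -l printed.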

In fact, most serial ports are associated with two special files with different major but the same minor numbers. For example, both /dev/ttyS0 and /dev/cua0 are associated with the first UART serial port. Since device special files are a mechanism for extending the device/file analogy to the open(2) and close(2) system calls, different special files for the same device must correspond to different behaviors on one or both of these system calls. In fact, it is the open(2) system call that behaves differently depending on which of the two special files is opened.

Recall that the DCD modem control signal (see Modem Control Signals) is an indication to a DTE from a DCE that a dialed call has been completed and the opposite modem's carrier signal detected. If a program opens a serial port expecting to receive a call, it makes sense for the operating system to put that process to sleep until the modem asserts DCD since there is nothing for it to do before a call is established. On the other hand, if a program opens a serial port for purposes of placing a call, then it must have access to the modem before DCD is asserted in order to direct it to dial the number. This is precisely the difference between /dev/ttyS0 and /dev/cua0: a program that opens /dev/ttyS0 will block in the system call until DCD is asserted whereas it will not block if it opens /dev/cua0. For this reason, /dev/cua0 is often referred to as the "callout device."

Note that the blocking behavior of /dev/ttyS0 can be modified by setting the CLOCAL control mode flag, which causes the modem status lines to be ignored. This flag indicates that the terminal associated with the serial port is connected "locally", as opposed to remotely via a modem.
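One common way to set CLOCAL on a port whose open(2) would otherwise block is sketched below. It relies on the O_NONBLOCK flag to open(2), which is not discussed in the text above and is an assumption of this example, as is the helper name open_local. After CLOCAL is set, the sketch switches the descriptor back to blocking I/O.

```c
#include <fcntl.h>
#include <termios.h>
#include <unistd.h>

/* Open a serial port without waiting for DCD and mark it as a
 * locally connected line. Returns the file descriptor, or -1 on
 * error. Hypothetical helper for illustration. */
int open_local(const char *path)
{
  struct termios tios;
  int fd, flags;

  /* O_NONBLOCK keeps open() from sleeping until DCD is asserted;
   * O_NOCTTY keeps the port from becoming our controlling terminal. */
  fd = open(path, O_RDWR | O_NOCTTY | O_NONBLOCK);
  if(fd < 0)
    return -1;

  /* Tell the driver to ignore the modem status lines from now on. */
  if(tcgetattr(fd, &tios) < 0)
    goto fail;
  tios.c_cflag |= CLOCAL;
  if(tcsetattr(fd, TCSANOW, &tios) < 0)
    goto fail;

  /* Restore ordinary blocking reads and writes. */
  flags = fcntl(fd, F_GETFL);
  if(flags < 0 || fcntl(fd, F_SETFL, flags & ~O_NONBLOCK) < 0)
    goto fail;
  return fd;

fail:
  close(fd);
  return -1;
}
```

A call such as open_local("/dev/ttyS0") would then behave much like opening the callout device, at least as far as blocking on DCD is concerned.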

ioctl and termios

Special files are the method used to extend the analogy between files and devices to the open(2) and close(2) system calls; however, it is not at all clear how to express operations such as changing the bit rate of a serial port or enabling hardware flow control in terms of operations that can also be done on a file. In fact, an additional system call, ioctl(2), exists for the purpose of providing a mechanism for implementing these sorts of ancillary "I/O Control" functions.

Because of the central role played by serial ports in user interaction, the serial device driver provides a much richer set of ioctl(2) settings than might be expected for a device as simple as the UART. In fact, many of these settings control things like character substitution and line buffering that have nothing to do with the UART per se.6 Nonetheless, a subset of the serial ioctl(2)s have a one-to-one correspondence with UART settings like bit rate, character size, parity, modem control signals, etc.

These days it is rare and considered bad form (for portability reasons) to use the ioctl(2) system call on a serial port directly. Instead, the GNU C Library provides a set of mediating functions defined by the POSIX standard and declared in the header file <termios.h>. Full details on these functions can be found in The GNU C Library Reference Manual.

The <termios.h> functions are passed a data structure called struct termios that contains the following members

              tcflag_t c_iflag;      /* input modes */
              tcflag_t c_oflag;      /* output modes */
              tcflag_t c_cflag;      /* control modes */
              tcflag_t c_lflag;      /* local modes */
              cc_t c_cc[NCCS];       /* control chars */

The first four members are bit masks in which programs can set and clear flags to specify certain behaviors. The UART settings can be directly mapped onto flags in the third bit mask, the control modes.

termios Control Modes

As an example of how to use <termios.h> functions to configure UART settings, let us see how the control modes settings are used to configure a serial port to operate at 9600 bps, with 8-bit characters, no parity and one stop bit. (This is a very common serial port setting, usually abbreviated to something like "9600 8n1".) Usually, 9600 bps is slow enough that flow control is not needed, either.

Because of the plethora of terminal settings available, the recommended method for changing them is to first read the current settings into a struct termios and then set and clear bits to make the necessary changes. That way, the settings that we haven't touched will remain at reasonable default values (hopefully). The following code reads the current settings for the serial port /dev/ttyS0 into a struct termios called tios.

#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <termios.h>
#include <unistd.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
  struct termios tios;
  int fd;

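  /* Open the serial port for reading and writing. */
  fd = open("/dev/ttyS0", O_RDWR);
  if(fd < 0) {
    perror("open failed");  /* perror appends ": " and the reason */
    return 1;
  }
  /* Read the current terminal settings into tios. */
  if(tcgetattr(fd, &tios) < 0) {
    perror("tcgetattr failed");
    return 2;
  }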

From here, we go about changing the settings one at a time until we arrive at "9600 8n1".
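The individual changes might look like the following sketch. It is written as a separate helper (the name set_9600_8n1 is hypothetical) so that it can be read on its own; in the running example the same statements could appear inline in main, using tios. in place of tios->. It uses the POSIX functions cfsetispeed and cfsetospeed and the control mode flags CSIZE, CS8, PARENB and CSTOPB. CRTSCTS is a common extension rather than part of POSIX, so it is guarded by #ifdef here.

```c
#include <termios.h>

/* Adjust the settings in *tios to 9600 bps, 8 data bits, no parity,
 * one stop bit, no hardware flow control. Only the structure is
 * changed; nothing reaches the hardware until tcsetattr is called.
 * Returns 0 on success, -1 if the speed could not be set. */
int set_9600_8n1(struct termios *tios)
{
  /* 9600 bps in both directions. */
  if(cfsetispeed(tios, B9600) < 0 || cfsetospeed(tios, B9600) < 0)
    return -1;

  /* 8 data bits: clear the character size field, then set CS8. */
  tios->c_cflag &= ~CSIZE;
  tios->c_cflag |= CS8;

  /* No parity. */
  tios->c_cflag &= ~PARENB;

  /* One stop bit (CSTOPB set would mean two). */
  tios->c_cflag &= ~CSTOPB;

#ifdef CRTSCTS
  /* No RTS/CTS hardware flow control. */
  tios->c_cflag &= ~CRTSCTS;
#endif

  /* Enable the receiver. */
  tios->c_cflag |= CREAD;
  return 0;
}
```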

Finally, we can apply our changes using the tcsetattr function

  tcsetattr(fd, TCSANOW, &tios);

Note that this function returns 0, indicating success, if any of the changes could be carried out.

termios Input/Output Modes

The control mode settings are the only ones that directly affect the UART hardware, but there are numerous other settings that control the way the serial device driver and kernel TTY layers will process I/O on the device. At the lowest software level, these settings control how parity errors and breaks are handled and character substitution.

Parity

If the hardware is configured to generate parity bits on transmitted characters and to check parity on received characters (i.e., the PARENB flag is set in the control modes, see ioctl and termios), the software must still decide what to do when parity errors are reported on received characters. There are four choices for how to handle a character containing a parity error: pass it through exactly as received (ignore the error), drop it (ignore the character), mark it as containing an error, or replace it with something else. These behaviors are controlled by flags in the terminal input modes, c_iflag, namely INPCK, IGNPAR and PARMRK.

  1. Ignore the error: disable input parity checking.
            tios.c_iflag &= ~INPCK;
    
    The UART will still generate parity bits on transmitted characters as long as the PARENB bit in the control modes is set, but input characters with parity errors will be passed on as they were received by the device driver.
  2. Ignore the character: enable input parity checking and ignore any character containing an error.
            tios.c_iflag |= (INPCK | IGNPAR);
            tios.c_iflag &= ~PARMRK;
    
    Any received characters containing parity errors will be silently dropped.
  3. Mark the character as containing an error.
            tios.c_iflag |= (INPCK | PARMRK);
            tios.c_iflag &= ~IGNPAR;
    
    This will cause a character containing a parity or framing error to be replaced by a three character sequence consisting of the erroneous character preceded by \377 and \000. A legitimate received \377 will be replaced by a pair of \377s.
  4. Replace the character with another
            tios.c_iflag |= INPCK;
            tios.c_iflag &= ~(IGNPAR | PARMRK);
    
    This will cause a character containing a parity or framing error to be replaced by the null character \000.
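A program that chooses the third option has to undo the marking when it reads its input. The sketch below shows one way this might be done, assuming PARMRK is set and ISTRIP is not; the function name decode_parmrk and the separate `bad` array are inventions of this example.

```c
#include <stddef.h>

/* Decode `n` bytes read from a terminal with PARMRK set.
 * Each decoded character is stored in out[k], and bad[k] is set to 1
 * if it arrived marked as a parity or framing error, 0 otherwise.
 * Both arrays must have room for n entries.
 * Returns the number of decoded characters. A marker sequence cut
 * off at the end of the buffer is passed through unchanged. */
size_t decode_parmrk(const unsigned char *in, size_t n,
                     unsigned char *out, unsigned char *bad)
{
  size_t i = 0, k = 0;

  while(i < n) {
    if(in[i] == 0377 && i + 1 < n && in[i + 1] == 0377) {
      /* \377 \377 stands for one legitimate \377. */
      out[k] = 0377; bad[k] = 0; k++;
      i += 2;
    } else if(in[i] == 0377 && i + 2 < n && in[i + 1] == 0) {
      /* \377 \000 X marks X as received in error. */
      out[k] = in[i + 2]; bad[k] = 1; k++;
      i += 3;
    } else {
      /* Ordinary character. */
      out[k] = in[i]; bad[k] = 0; k++;
      i += 1;
    }
  }
  return k;
}
```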

Carriage Return and Line Feed

The character substitutions provided by the low-level input/output modes are there in part to deal with the endless confusion over how lines of text should be terminated. Although one might think that ASCII text is a well-defined standard file format, it leaves unspecified what character or character-sequence should mark the end of a line. There are three commonly used conventions composed of two characters: carriage return (ASCII octal code \015, <Ctrl>-M, '\r') and line feed (ASCII octal code \012, <Ctrl>-J, '\n'). Macintosh computers use a single carriage return, VMS, DOS and Windows computers use a carriage return followed by a line feed, and Unix computers use a single line feed.

One could think of an ASCII text file as being a transcript of a terminal session. In particular, a command like

$ cat file.txt >/dev/ttyS0

should produce readable output on the terminal attached to /dev/ttyS0 and a command like

$ cat /dev/ttyS0 >file.txt

should capture all of the input from the terminal in the file file.txt. Typically, real character-cell terminals will treat received carriage return and line feed characters literally: a carriage return moves the cursor to the leftmost column without altering its row position and a line feed advances the cursor to the next row without altering its column position. Therefore, in order for the first command above to produce the expected output on the terminal screen, the lines in file.txt should be terminated by both a carriage return and a line feed (the DOS convention). However, real terminals typically do not transmit more than one ASCII character per key typed, and the <Enter> or <Return> key will usually cause it to transmit a single carriage return. Therefore, the second command above will create a file.txt with lines terminated by a single carriage return (the Macintosh convention). It could be said that terminal output follows the DOS convention and terminal input follows the Macintosh convention.

There are four flags available in the struct termios low-level input and output modes c_iflag and c_oflag for coping with this mess, namely IGNCR, ICRNL, INLCR and ONLCR.

  1. IGNCR: ignore carriage return. If this flag is set, carriage returns are discarded on input. This is only useful if the attached terminal transmits both carriage return and line feed characters when the <Return> key is pressed (rare).
  2. ICRNL: replace input carriage returns with line feeds. This is useful if the attached terminal transmits a single carriage return character when the <Return> key is pressed (the most common case). The lines received by the process reading from the serial port will be terminated by a single line feed character ('\n'), which is again the Unix convention.
  3. INLCR: replace input line feeds with carriage returns. This is the inverse of the ICRNL flag, and rarely used.
  4. ONLCR: replace output line feeds with carriage return and line feed. This will cause a line of text terminated by a single line feed character to display properly on a terminal that moves its cursor as described above.

Therefore, the most common character substitution configuration is to set the flags ICRNL and ONLCR and leave the rest cleared.

    tios.c_iflag &= ~(IGNCR | INLCR);
    tios.c_iflag |= ICRNL;
    tios.c_oflag |= ONLCR;

This way, output lines of text terminated according to the Unix convention (a single '\n') will be transmitted to the terminal as lines terminated by a carriage return and line feed ('\r' followed by '\n') and will display properly. Furthermore, input lines of text terminated by a single carriage return ('\r') will be read by a process as lines terminated by a single line feed ('\n'), which is the Unix convention.

Note that there is no combination of these flags that will just replace output line feeds with carriage returns (without a following line feed). You might expect that ONLCR is exactly complementary to ICRNL, but the latter replaces a single character ('\r') with a single character ('\n') on input, whereas the former replaces a single character ('\n') with two characters ('\r' and '\n') on output.
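The asymmetry between the two substitutions is perhaps easiest to see when they are written out. The following sketch mimics in user space what the driver does for ICRNL and ONLCR; it is not the kernel's code, and the function names are hypothetical. (On Linux, output substitutions such as ONLCR are applied only when the OPOST flag is also set in c_oflag.)

```c
#include <stddef.h>

/* Mimic ICRNL: replace each carriage return in buf with a line
 * feed. The length of the data does not change. */
void map_cr_to_nl(char *buf, size_t n)
{
  size_t i;

  for(i = 0; i < n; i++)
    if(buf[i] == '\r')
      buf[i] = '\n';
}

/* Mimic ONLCR: copy `in` to `out`, writing "\r\n" for each '\n'.
 * `out` must have room for 2*n bytes in the worst case.
 * Returns the number of bytes written, which may exceed n. */
size_t expand_nl_to_crnl(const char *in, size_t n, char *out)
{
  size_t i, k = 0;

  for(i = 0; i < n; i++) {
    if(in[i] == '\n')
      out[k++] = '\r';
    out[k++] = in[i];
  }
  return k;
}
```

The input mapping works in place because it is one character for one character, whereas the output mapping needs a second, larger buffer.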

Stripping 8-bit characters

The ASCII character set contains only 128 distinct characters: it fits entirely into seven bits. Nonetheless, when reading or writing data to a terminal device, characters are always padded so that every character occupies exactly one byte, regardless of the configured number of bits per character (see Bits and Characters). These settings only affect the number of bits that are put on the wire by the UART; the operating system must still provide the UART with one byte per character transmitted and the user program with one byte per character received.

Since the ASCII character set fits in seven bits, but a program reading from a terminal device reads one byte (eight bits) per character, it is sometimes useful to be able to strip the eighth, most significant bit off of input characters (in practice, this means setting it to zero). The input mode flag ISTRIP is provided for precisely this purpose:

  tios.c_iflag |= ISTRIP;

The most likely scenario where this flag is useful is if the communication channel is configured for parity and seven bits per character. In this case, the eighth bit on every received character is a parity bit, not part of the data payload. The user program does not need to know the value of the parity bit to check for parity errors; the device driver will check parity automatically if the PARENB control mode flag and the INPCK input mode flag are set (see termios Input and Output Modes).
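Putting the pieces of this scenario together, a configuration for seven data bits with even parity might be sketched as follows. The helper name is hypothetical, and PARODD (the POSIX control mode flag that selects odd rather than even parity) is not otherwise discussed here.

```c
#include <termios.h>

/* Adjust *tios for 7 data bits with even parity, with parity checked
 * by the driver and the eighth bit cleared on input.
 * Hypothetical helper for illustration. */
void set_7bit_even_strip(struct termios *tios)
{
  /* Seven data bits. */
  tios->c_cflag &= ~CSIZE;
  tios->c_cflag |= CS7;

  /* Generate and expect a parity bit; clearing PARODD selects even. */
  tios->c_cflag |= PARENB;
  tios->c_cflag &= ~PARODD;

  /* Check parity on input and strip the eighth bit. */
  tios->c_iflag |= INPCK | ISTRIP;
}
```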

Breaks

The single out of band signal supported by most UARTs is the break (see Break). Transmission of a break is meant to signal an exceptional condition, and a program reading from a serial port can choose how it wants to deal with these sorts of exceptions. Two input mode flags are provided to configure how breaks are handled, namely IGNBRK and BRKINT. If IGNBRK is set, then break conditions are ignored. If IGNBRK is cleared and BRKINT is set then a SIGINT signal will be sent to the foreground process group associated with the terminal (see Sessions groups processes).

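If both IGNBRK and BRKINT are clear, a break is delivered to the reading program as a null byte ('\0'), or as the three-byte sequence '\377' '\0' '\0' if the PARMRK input mode flag is set.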
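
The possible combinations of IGNBRK and BRKINT can be captured in one small helper, sketched below. The enum and the function name are hypothetical. The behavior noted for the case where both flags are clear (the break is read as a null byte) is the one documented in the Linux termios(3) manual page, not something established by the experiment suggested above.

```c
#include <termios.h>

/* Hypothetical names for the three ways a break can be handled. */
enum break_policy { BREAK_IGNORE, BREAK_SIGNAL, BREAK_AS_NUL };

/* Edit the input mode flags so that breaks are handled as requested.
 * Only the structure is changed; apply it with tcsetattr(). */
void set_break_policy(struct termios *tios, enum break_policy policy)
{
    tios->c_iflag &= ~(IGNBRK | BRKINT);   /* start with both flags clear */
    switch (policy) {
    case BREAK_IGNORE:
        tios->c_iflag |= IGNBRK;           /* breaks are discarded */
        break;
    case BREAK_SIGNAL:
        tios->c_iflag |= BRKINT;           /* SIGINT to the foreground group */
        break;
    case BREAK_AS_NUL:
        /* both clear: per termios(3), the break reads as '\0' */
        break;
    }
}
```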

Software flow control

Two ASCII characters can be set aside for purposes of flow control. Traditionally, these are <Ctrl>-s (ASCII DC3, '\023') and <Ctrl>-q (ASCII DC1, '\021'), but the actual values at any time can be accessed by reading or writing the values of tios.c_cc[VSTOP] and tios.c_cc[VSTART], respectively. If the IXON input mode flag is set, then the device driver will suspend output to the terminal device (by putting the process to sleep the next time it tries to write to the device) when it receives the STOP character on input. Output is resumed (by waking up the process) when the device driver receives the START character on input. Since the START/STOP characters control the flow of output, this is called "output flow control".

Conversely, if the IXOFF input mode flag is set, then the device driver will output (transmit) the STOP character if the input buffer is filling faster than programs are reading it, and output the START character once enough space becomes available. In this case, it is the responsibility of the hardware connected to the serial port (e.g. a terminal) to stop transmitting data when it receives the STOP character until it receives a START. Here, then, the START/STOP characters control the flow of input, and therefore this is called "input flow control".
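
As a minimal sketch, both directions of software flow control could be switched on as follows, with the traditional DC1/DC3 characters. The function name enable_xon_xoff is hypothetical, and enabling both directions at once is just one possible choice; the function edits the structure only and leaves the tcsetattr call to the caller.

```c
#include <termios.h>

/* Hypothetical helper: enable output (IXON) and input (IXOFF) flow
 * control, using the traditional START and STOP characters. */
void enable_xon_xoff(struct termios *tios)
{
    tios->c_iflag |= IXON | IXOFF;
    tios->c_cc[VSTART] = '\021';   /* DC1, <Ctrl>-q */
    tios->c_cc[VSTOP]  = '\023';   /* DC3, <Ctrl>-s */
}
```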

termios Local Modes

The high-level input processing settings available in the tios.c_lflag mask mostly control the ways that characters are echoed. In this context, echo means that the device driver will transmit every character received back to the terminal connected to the serial port. This is the preferred setting, unless the terminal itself has a "local echo" enabled. There are some additional wrinkles to this behavior, concerning how erase characters are echoed, that are rarely used these days since they are intended for hardcopy terminals. However, there is one very important flag, ICANON, that deserves a more detailed treatment.

Canonical Input Processing

Recall the model of a character-cell terminal connected to a serial port providing a user with an interactive shell. The user types command lines, and the shell executes them. The role of the operating system kernel is to deliver the command lines to the shell and their output to the terminal. Typically, the shell process is sleeping in a blocking read, waiting for input from the user. The kernel will receive an interrupt for every character typed by the user. It could pass the characters on to the shell one at a time as they are typed, but this is inefficient because the shell works with input one line at a time (i.e. it doesn't do anything until you hit return), and there is a fairly high probability that any line of input will contain at least one correction (remember, the backspace character is also an ASCII character, \010, and is treated like any other by the serial port hardware). So if the device driver itself could buffer the input into lines, which implies that it must also handle corrections, then the shell process can be left sleeping until the input is really ready for it to do something with it. This greatly reduces the amount of context-switching.

This is precisely the meaning of "canonical input processing": if the serial port is configured for canonical input processing (i.e. the ICANON bit is set in the local modes, tios.c_lflag) then the device driver will buffer input into lines and a process reading from the serial port will block until a line delimiter is received. The delimiter is normally a newline; a carriage return typed at the terminal counts because the ICRNL input mode flag, which is normally set, translates it to a newline on input. Since this implies that the device driver will also have to handle the correction characters (backspace, word erase, line erase, etc.), there are a number of additional settings available to control how this will be done, and in particular how erasures will be echoed to the terminal: ECHO, ECHOE, ECHOPRT, ECHOK, ECHOKE and ECHONL. The usual default settings for these flags (with canonical input processing) are

    tios.c_lflag |= (ICANON | ECHO | ECHOE | ECHOK | ECHOKE);
    tios.c_lflag &= ~(ECHOPRT | ECHONL);

In practice, these flags should only deviate from the defaults if the device connected to the serial port is not a terminal, or if it is a hardcopy (printing) terminal. For complete details, see The GNU C Library Reference Manual.
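
The line buffering described in this section can be imitated in user space to make the idea concrete. The following is a toy model only, not the kernel's implementation: the names line_buf, canon_feed and LINE_CAP are hypothetical, a single erase character stands in for all of the correction characters, and echoing is left out entirely.

```c
#include <stddef.h>

#define LINE_CAP 256   /* hypothetical buffer size for this sketch */

struct line_buf {
    char   data[LINE_CAP];
    size_t len;
    int    ready;      /* nonzero once data holds a complete line */
};

/* Feed one received character into the buffer. Returns 1 when a complete
 * line (terminated by '\n') is available in lb->data, 0 otherwise. */
int canon_feed(struct line_buf *lb, char c, char erase_ch)
{
    if (lb->ready) {               /* assume the last line was consumed */
        lb->len = 0;
        lb->ready = 0;
    }
    if (c == erase_ch) {           /* correction: drop the last character */
        if (lb->len > 0)
            lb->len--;
    } else {
        if (c == '\r')             /* the translation ICRNL performs */
            c = '\n';
        /* always leave room for the newline and the terminating NUL */
        if (c == '\n' || lb->len < LINE_CAP - 2)
            lb->data[lb->len++] = c;
        lb->ready = (c == '\n');
    }
    lb->data[lb->len] = '\0';
    return lb->ready;
}
```

A reader built on this model would stay asleep while canon_feed returns 0 and would be woken only when it returns 1, which is the saving in context switches that canonical mode is meant to provide.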

Non-Canonical Input Processing

Since an interactive shell is not the only process that might read from a serial port, there has to be some provision for running a serial port without canonical input processing. The important question is one of buffering input. In canonical mode, input is buffered one line at a time; in non-canonical mode, the amount of buffering is determined by two parameters in the termios structure: tios.c_cc[VMIN] and tios.c_cc[VTIME]. The notion is that VMIN specifies the minimum number of characters to buffer and VTIME the maximum amount of inter-character time (in tenths of a second) before the device driver returns the buffered input to a process reading from the serial port. The precise interactions between the two parameters are subtle; once again, full details are found in The GNU C Library Reference Manual.
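
A minimal sketch of switching a termios structure to non-canonical mode follows. The function name set_noncanonical is hypothetical, and the summary of the four VMIN/VTIME combinations in the comment follows the Linux termios(3) manual page; the echo flags are deliberately left untouched.

```c
#include <termios.h>

/* Hypothetical helper: turn off canonical processing and set the two
 * buffering parameters. For a read() in non-canonical mode:
 *   VMIN > 0, VTIME > 0: return after VMIN characters, or when the gap
 *                        between two characters exceeds VTIME tenths
 *                        of a second (timer starts at the first one)
 *   VMIN > 0, VTIME = 0: block until VMIN characters have arrived
 *   VMIN = 0, VTIME > 0: return after one character or VTIME tenths
 *                        of a second, whichever comes first
 *   VMIN = 0, VTIME = 0: return at once with whatever is available
 */
void set_noncanonical(struct termios *tios, cc_t vmin, cc_t vtime)
{
    tios->c_lflag &= ~ICANON;
    tios->c_cc[VMIN]  = vmin;
    tios->c_cc[VTIME] = vtime;
}
```

For example, set_noncanonical(&tios, 1, 0) gives a read that blocks until a single character arrives, which matches the "min = 1; time = 0" values that stty commonly reports.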

stty

The stty(1) command provides access to the termios settings from the command shell. The easy way to find the current termios settings for the controlling terminal of a running login shell is to type the command stty -a, which produces output similar to the following:

$ stty -a
speed 38400 baud; rows 24; columns 80; line = 0;
intr = ^C; quit = ^\; erase = ^?; kill = ^U; eof = ^D;
eol = <undef>; eol2 = <undef>; start = ^Q; stop = ^S;
susp = ^Z; rprnt = ^R; werase = ^W; lnext = ^V; flush = ^O;
min = 1; time = 0;
-parenb -parodd cs8 -hupcl -cstopb cread -clocal -crtscts
-ignbrk -brkint -ignpar -parmrk -inpck -istrip -inlcr -igncr
icrnl ixon -ixoff -iuclc -ixany -imaxbel
opost -olcuc -ocrnl onlcr -onocr -onlret -ofill -ofdel nl0
cr0 tab0 bs0 vt0 ff0
isig icanon iexten echo echoe echok -echonl -noflsh -xcase
-tostop -echoprt echoctl echoke

For the most part, the settings shown above correspond one-to-one to termios flags and control characters. The speed setting is the line speed. A setting prefixed with a - in the list is negated (cleared); one without the prefix is set.
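
To illustrate the correspondence, the sketch below prints a few local mode flags in the same style that stty uses, with a leading - for a flag that is clear. It is not how stty itself is implemented; the names flag_name, lflag_names and format_lflags are hypothetical, and only six flags are covered.

```c
#include <stdio.h>
#include <string.h>
#include <termios.h>

struct flag_name { tcflag_t bit; const char *name; };

/* A handful of local mode flags and the names stty uses for them. */
static const struct flag_name lflag_names[] = {
    { ISIG,  "isig"  }, { ICANON, "icanon" }, { ECHO,   "echo"   },
    { ECHOE, "echoe" }, { ECHOK,  "echok"  }, { ECHONL, "echonl" },
};

/* Write the flags into buf, separated by spaces, and return buf.
 * Output stops early if buf is too small. */
char *format_lflags(const struct termios *tios, char *buf, size_t size)
{
    size_t n = sizeof lflag_names / sizeof lflag_names[0];
    size_t used = 0;

    if (size == 0)
        return buf;
    buf[0] = '\0';
    for (size_t i = 0; i < n; i++) {
        int set = (tios->c_lflag & lflag_names[i].bit) != 0;
        int w = snprintf(buf + used, size - used, "%s%s%s",
                         i ? " " : "", set ? "" : "-", lflag_names[i].name);
        if (w < 0 || (size_t)w >= size - used)
            break;                 /* out of room */
        used += (size_t)w;
    }
    return buf;
}
```

Filling the structure with tcgetattr(STDIN_FILENO, &tios) and passing it to format_lflags should yield a line that agrees with the corresponding entries of stty -a run on the same terminal.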

Accessing the Modem Control Signals

Getting the modem control signals.

    #include <stdio.h>
    #include <sys/ioctl.h>

    /* fd is an open serial port; the TIOCM_* bits come from <sys/ioctl.h> */
    int flags = 0;
    if ( ioctl(fd, TIOCMGET, &flags) == -1 ) perror("TIOCMGET");
    if ( flags & TIOCM_RTS ) puts(" RTS");
    if ( flags & TIOCM_CTS ) puts(" CTS");
    if ( flags & TIOCM_DSR ) puts(" DSR");
    if ( flags & TIOCM_DTR ) puts(" DTR");
    if ( flags & TIOCM_CAR ) puts(" DCD");
    if ( flags & TIOCM_RNG ) puts(" RI");

TIOCMSET
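
Setting the modem control signals could follow the read-modify-write pattern sketched here, assuming a Linux system where TIOCMGET, TIOCMSET and the TIOCM_* bit names are provided by <sys/ioctl.h>. The function name set_modem_lines is hypothetical.

```c
#include <sys/ioctl.h>

/* Hypothetical helper: raise (on != 0) or drop (on == 0) the modem
 * control lines named in mask, e.g. TIOCM_DTR or TIOCM_RTS.
 * Returns 0 on success and -1 on error, with errno set by ioctl(). */
int set_modem_lines(int fd, int mask, int on)
{
    int flags;

    if (ioctl(fd, TIOCMGET, &flags) == -1)   /* read the current state */
        return -1;
    if (on)
        flags |= mask;
    else
        flags &= ~mask;
    return ioctl(fd, TIOCMSET, &flags) == -1 ? -1 : 0;
}
```

For instance, set_modem_lines(fd, TIOCM_DTR, 0) would drop DTR on an open serial port. Only the output lines (DTR and RTS) can usefully be changed this way; the others are inputs driven by the device at the far end of the cable.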

Serial ports and getty

init, getty and login

Alternatives to getty: agetty

Modem gettys: mgetty and vgetty

Sessions, groups and processes

Controlling terminals and daemons

Terminal Signals

SIGTTIN SIGTTOU SIGWINCH SIGTSTP

Serial consoles

Console cannot be a controlling tty.

Linux virtual terminals and mingetty

openvt(1), doshell(8) and switchto(1) commands

Major types of terminals

DEC VT100, Tektronix 4014, IBM 3270

The TERM environment variable

Terminal independence 1: termcap and terminfo

Terminal independence 2: curses and ncurses

Pseudo-terminals

telnet: the network virtual terminal

The xterm terminal emulator

X terminals

Resources

Footnotes

  1. There are devices that are called terminals but fail to meet either of these criteria. Notable examples are the venerable IBM 3270 series of terminals which connect to an SNA network (usually a 3174 establishment controller) via coaxial connectors and speak EBCDIC. These differences reflect the very different philosophy of mainframe computing as opposed to the minicomputer environment of Unix and VMS. When minicomputers were introduced by Digital Equipment Corporation, the concept of a user terminal generating an interrupt on the main CPU for every character typed was revolutionary; contemporaneously, IBM was using specialized equipment (establishment controllers, communications controllers, front-end processors, etc.) to offload communications functions from the mainframe. IBM's terminals operated one page at a time; DEC's terminals operated one character at a time. This could be the subject of a very interesting article, but it is beyond the scope of this one, so let us stick to our original working definition of a terminal and leave it to the pedants to figure out where these other beasts fit in the terminal taxonomy.

  2. There are two more possible parity settings, rarely seen, called "mark" and "space" parity. If one of these settings is chosen, the transmitted parity bit is always a mark or a space, respectively.

  3. The Linux system calls are declared in the file /usr/include/linux/syscall.h. The assembly traps are found in /usr/src/linux/arch/i386/kernel/entry.S.

  4. The only difference that comes to mind is that most libraries are dynamically loaded and the code is shared by all processes using the library simultaneously.

  5. Since major numbers identify device drivers, there has to be a convention specifying which number corresponds to each device driver. This can be found in the file /usr/src/linux/Documentation/devices.txt.

  6. It could reasonably be argued that the purpose of pseudo-terminals, virtual terminals that do not have physical UARTs and do not modify their behavior in response to ioctl(2)s changing UART settings, is to provide these non-UART settings and behaviors to interactive programs like the shell in situations when there is no real UART involved such as a network login or X Windows session. See Pseudo-terminals.

  7. In fact, input and output speeds can be set independently to unequal values using the cfsetispeed and cfsetospeed functions, but this is rarely done in practice.

  8. The GNU C Library actually allows input (RTS) and output (CTS) flow control to be independently controlled using the CRTS_IFLOW and CCTS_OFLOW flags, respectively.