Although it is rarely done today, one can connect a video terminal
with a null modem cable to a serial port on a Linux box and get an
interactive shell with very little effort. All that is required is
that you add a line such as
dt:12345:respawn:/sbin/agetty -L ttyS0 9600 vt510
to /etc/inittab, and then run telinit q as root to
get init to re-read the /etc/inittab file, and you
will be greeted by a login prompt on the terminal screen.
The ease with which we can do this task that we almost never want to do is due to the long legacy of Unix that Linux inherited. When Unix was young, video terminals were the user interface of choice and became a very important concept in the design of the operating system. Today, although "glass teletypes" are becoming more and more rare, the terminal has remained at the heart of the operating system as a very powerful metaphor.
Most computer users sophisticated (or perhaps old) enough to know what a character cell terminal is probably think of them as relics of the past before the GUI revolutionized the user interface. Nonetheless, the first thing almost all Unix users do immediately after logging in to their elaborate desktops (and watching the dazzling eye candy meant to distract them from the intolerably long wait until the desktop system is ready to do something useful) is to throw up a terminal window to get an interactive shell into which they can type commands. Many probably think of it as just an interactive shell and not a terminal window, but in fact in Unix you cannot have an interactive shell without a controlling terminal.
The controlling terminal is one of the properties listed by the
ps command, in the column headed TTY below:
$ ps
  PID TTY          TIME CMD
26841 pts/1    00:00:00 bash
26897 pts/1    00:00:00 ps
pts/1 is actually an abbreviation for /dev/pts/1, a
character special file for the interactive shell's (PID 26841)
controlling terminal:
$ ls -l /dev/pts/1
crw--w---- 1 user tty 136, 1 Feb 21 00:35 /dev/pts/1
This is a bit of a peculiar thing, because normally character special
files correspond to hardware devices such as serial ports. However,
there is no hardware corresponding to /dev/pts/1; it is a
"pseudo-terminal". When a terminal window is opened, a new
pseudo-terminal will appear in /dev/pts that will disappear
again once the window is closed. Pseudo-terminals are discussed in
detail below, but for now they can be thought of as virtual serial
ports that allow the interactive shell and the operating system to use
the familiar user interface paradigm of a user sitting in front of a
character-cell terminal connected to the computer by a serial port,
even in situations that bear little resemblance to the one described.
For the purposes of this book, then, a "terminal" will be any device
that connects to a computer through an asynchronous serial port (real
or virtual) and communicates with it using the ASCII character
set.1
From the operating system's point of view, the terminal is the serial port and its associated device driver. Therefore, in this chapter the operation of the hardware and the implementation of the device driver are examined in some detail starting from the voltages on the wires and moving up to a discussion of the protocols for grouping serial bits into characters.
The serial part of "asynchronous serial communications" is easy to explain: it refers to the fact that the data are represented by voltages driven serially on one physical wire. This is as opposed to a parallel interface, which would transmit bits on more than one physical wire simultaneously. To say that the voltages are driven serially simply means that to transmit a character the transmitter must assert the voltages corresponding to each bit of the character sequentially in time. It is up to the receiver to reassemble them into the original character.
When a 1 bit is driven onto the wire the corresponding voltage is referred to as a "mark", and the voltage corresponding to a 0 bit is a "space". The RS-232C standard specifies the following voltages
| | Mark | Space |
|---|---|---|
| Transmit | -5V to -15V | +5V to +15V |
| Receive | -3V to -25V | +3V to +25V |
by which it is meant, for example, that a standard-compliant line driver must drive a voltage between -5 and -15 volts to represent a mark, and a standard-compliant line receiver must recognize a voltage between -3 and -25 volts to represent a mark, etc.
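The receive thresholds in the table can be expressed as a small classification function. This is only a sketch: the names `classify_rx_voltage` and `enum line_state` are made up for illustration, and treating every voltage outside the two ranges as undefined is an assumption of the example.

```c
#include <assert.h>

/* Logical state a line receiver assigns to a voltage on the wire. */
enum line_state { LINE_MARK, LINE_SPACE, LINE_UNDEFINED };

/* Classify a received voltage (in volts) using the receive ranges from
 * the table: -3 V to -25 V is a mark, +3 V to +25 V is a space.
 * Anything else, including the region around 0 V, is reported as
 * undefined (an assumption of this sketch). */
enum line_state classify_rx_voltage(double volts)
{
    if (volts <= -3.0 && volts >= -25.0)
        return LINE_MARK;
    if (volts >= 3.0 && volts <= 25.0)
        return LINE_SPACE;
    return LINE_UNDEFINED;
}
```

A driver output of -12 volts would be classified as a mark and +5 volts as a space, while a floating or grounded input near 0 volts falls into neither range.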
If the transmitter and receiver were synchronized, then the receiver would know which bit is which simply by sampling the voltage on the wire at the times when bits are expected. This is, in fact, a fairly common way of doing serial communications, and synchronous serial ports are often used in telecommunications applications (for example, T1 lines). The important difference in asynchronous serial communications is that transmitter and receiver are only synchronized during a character; the receiver cannot determine when the transmitter will start the next character based on when the last one was sent.
When a transmitter is idle, it holds a mark on its output. That is, when a character is not being transmitted, the transmit pin is at a voltage of between -5 and -15 volts. Since the receiver is not synchronized with the transmitter, it does not know in advance when transmission of the next character will start. Therefore, asynchronous transmission of a character always begins with a "start bit", which the receiver uses as a reference point for timing the bits that follow. Since the idle channel is held at the voltage corresponding to a mark, the start bit is always a space so that it can be distinguished.
The bits that follow the start bit are the data bits. In order to reconstruct the character, the receiver must know the rate at which the transmitter is sending data bits and the number of data bits per character (not always 8!). For example, if the receiver sees the leading edge of the start bit at time t, and the bit rate is r bits per second with n bits per character, then it can sample the voltage on the wire near the middle of each bit time, at t + (i + 1/2)/r for i = 1, 2, 3 ... n, to determine the corresponding bits of the character.
Since there are a number of ways that things could go wrong with such a simple scheme as the one just described, two additional measures are provided to make communications more robust: parity and stop bits.
Parity means an additional bit is transmitted after the data bits of every character (so that it occupies the position above the most significant data bit), whose value is chosen so that the total number of ones transmitted always has the same parity, either even or odd. For example, the ASCII character 'A' is represented by the seven-bit binary number 1000001 which contains two ones. If even parity is chosen, then transmitted 'A' characters will carry a zero parity bit. If odd parity is chosen then transmitted 'A' characters will carry a one parity bit.2 Note that parity adds a one-bit overhead to the communications channel which can be avoided by choosing to use no parity at all.
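The parity rule can be written down in a few lines of C. The sketch below is illustrative only: `parity_bit` is a hypothetical helper, not part of any library, and it simply counts the ones in the data bits and picks the extra bit accordingly.

```c
#include <assert.h>

/* Return the parity bit (0 or 1) to send with the low `nbits` data bits
 * of `ch`, so that data bits plus parity bit together contain an even
 * number of ones (odd == 0) or an odd number of ones (odd != 0). */
int parity_bit(unsigned int ch, int nbits, int odd)
{
    int ones = 0;

    assert(nbits >= 5 && nbits <= 8);   /* character sizes used on serial lines */
    for (int i = 0; i < nbits; i++)
        ones += (int)((ch >> i) & 1u);

    /* (ones & 1) is the bit that makes the total even; invert it for odd. */
    return odd ? !(ones & 1) : (ones & 1);
}
```

For the seven-bit character 'A' (two ones), `parity_bit('A', 7, 0)` yields 0 and `parity_bit('A', 7, 1)` yields 1, matching the example in the text.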
Stop bits are more subtle. As was described above, an idle channel is held in a marking condition and therefore the start bit is always a space. The stop bits, on the other hand, are always marks and therefore are indistinguishable from an idle channel. The standard allows for one, one and a half or two stop bits. What this amounts to is a choice of the minimum inter-character spacing, with one bit-time being the minimum and two bit-times the maximum. In the past, two stop bits were required by hardcopy terminals whose mechanisms would otherwise not be able to keep up with the data stream, but it is rare to use more than one stop bit today.
| Setting | Possible values |
|---|---|
| Data rate | many |
| Data bits | 5, 6, 7, 8 |
| Stop bits | 1, 1.5, 2 |
| Parity | none, even, odd |
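These settings together determine how many bit times each character occupies on the wire, and therefore the best-case character throughput. The following sketch adds up the start bit, data bits, optional parity bit and stop bits; the function names are hypothetical and chosen only for this example.

```c
#include <assert.h>

/* Bit times occupied by one character on the wire: one start bit, the
 * data bits, an optional parity bit, and 1, 1.5 or 2 stop bits. */
double frame_bit_times(int data_bits, int parity_enabled, double stop_bits)
{
    assert(data_bits >= 5 && data_bits <= 8);
    return 1.0 + data_bits + (parity_enabled ? 1.0 : 0.0) + stop_bits;
}

/* Upper bound on characters per second at a given bit rate, assuming
 * the transmitter sends characters back to back with no idle time. */
double max_chars_per_second(double bits_per_second, int data_bits,
                            int parity_enabled, double stop_bits)
{
    return bits_per_second /
           frame_bit_times(data_bits, parity_enabled, stop_bits);
}
```

With 8 data bits, no parity and one stop bit, each character takes 10 bit times, so a 9600 bps line carries at most 960 characters per second.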
Things often go wrong on an asynchronous serial communications channel. However, it may be possible for the receiver to determine that an error has occurred in many situations. Three types of errors that are commonly detected are overrun, framing and parity.
An overrun means that the receiver's FIFO was full when a new character was received, forcing it to clobber existing data in order to store it. This is not a failure of the communications channel per se, but means that the computer isn't keeping up with the rate of data being received.
A framing error means that the receiver recognized a start bit, but did not find a valid stop bit when it was expected. This typically means that the intra-character timing was not synchronized between transmitter and receiver, possibly meaning that either the data rate or data bits settings do not match on transmitter and receiver.
A parity error simply means that the number of ones in a received character plus the parity bit did not match the setting specified.
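To make the framing and parity checks concrete, here is a sketch of what a receiver might do with the bits it has sampled for one character. It is not how any particular UART is implemented. It assumes 7 data bits, even parity and one stop bit, represents a mark as 1 and a space as 0, and assumes the common wire order of start bit, data bits (least significant first), parity bit, stop bit. Overruns are not modelled, and all names are hypothetical.

```c
#include <assert.h>

enum rx_status { RX_OK, RX_FRAMING_ERROR, RX_PARITY_ERROR };

/* Check one sampled frame: bits[0] is the start bit, bits[1..7] are the
 * seven data bits (least significant first), bits[8] is the parity bit
 * and bits[9] is the stop bit. A mark is 1 and a space is 0.
 * On success the decoded character is stored in *out. */
enum rx_status check_frame_7e1(const int bits[10], unsigned char *out)
{
    unsigned char ch = 0;
    int ones = 0;

    /* The start bit must be a space and the stop bit must be a mark. */
    if (bits[0] != 0 || bits[9] != 1)
        return RX_FRAMING_ERROR;

    for (int i = 0; i < 7; i++) {
        if (bits[1 + i]) {
            ch |= (unsigned char)(1u << i);
            ones++;
        }
    }

    /* Even parity: data bits plus parity bit must hold an even number of ones. */
    ones += bits[8] ? 1 : 0;
    if (ones % 2 != 0)
        return RX_PARITY_ERROR;

    *out = ch;
    return RX_OK;
}
```

Feeding it the frame for 'A' with a zero parity bit returns `RX_OK`; flipping the parity bit gives `RX_PARITY_ERROR`, and a space where the stop bit should be gives `RX_FRAMING_ERROR`.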
There is an "out of band" signal that can be transmitted by a serial port, namely, a break. A break is sent by holding the transmit wire in a spacing condition for a duration of at least two character times plus three bit times (as specified by the CCITT "blue book"). In other words, the transmitter holds the voltage corresponding to a zero bit for a length of time that is definitely longer than the transmission time for a single character. The receiver, therefore, can recognize a break since it receives more than a character's worth of spaces with no stop bit (which is always a mark, see Bits and Characters) when one would be expected.
Normally, recognizing breaks is the responsibility of the UART (see The UART), but if for some reason (such as a limited UART that does not implement this functionality) the UART fails to do so, reception of a break will manifest itself as a large number of framing errors (see Bit Errors).
Breaks are used to signal an interrupt to the receiving process (see Terminal Signals).
The serial communications scheme described so far only requires three wires to connect the transmitter and receiver: at either end there must be one wire for transmitting, one wire for receiving and one to carry a ground reference for the transmit and receive voltages. In practice, serial communications are often done with just these three wires. However, the standard provides for a number of ancillary signals for supporting the most common serial communications application: serial communications mediated by modems over the public telephone network. These are known collectively as the modem control signals.
The situation envisioned by the standard is that of a terminal or a computer (called "Data Terminal Equipment" or DTE) connected to a modem (called "Data Communications Equipment" or DCE) by an asynchronous serial channel. The modem is connected to the telephone network, and should be able to originate and/or receive telephone calls to/from other modems. The modem control signals are listed in the table below, and typically external modems have LEDs that display the condition of most of these signals at any time.
| Acronym | Full name | Comes From |
|---|---|---|
| DTR | Data Terminal Ready | DTE |
| DSR | Data Set Ready | DCE |
| RTS | Request To Send | DTE |
| CTS | Clear To Send | DCE |
| DCD | Data Carrier Detect | DCE |
| RI | Ring Indicator | DCE |
So the normal course of events when originating a call is
and when receiving a call
Like the transmitted and received data signals, the modem control signals are held at -5 to -15 volts when idle, and +5 to +15 volts when asserted. These are the same voltages used on the data lines (TXD and RXD) to represent mark and space, respectively (see Timing and Voltages). Therefore a binary zero and a control signal being asserted are both represented by the same voltage; in this sense the control signals can be said to use "complementary" or "negative" logic.
The inclusion of the modem control signals causes the number of wires to proliferate beyond the minimal three-wire configuration discussed above. In fact, most serial ports have either nine or 25 pins, with the signals assigned to pins as shown in the following table.
| Signal Name | Acronym | 25-pin | 9-pin |
|---|---|---|---|
| Protective ground | | 1 | N/A |
| Transmitted data | TXD | 2 | 3 |
| Received data | RXD | 3 | 2 |
| Request To Send | RTS | 4 | 7 |
| Clear To Send | CTS | 5 | 8 |
| Data Set Ready | DSR | 6 | 6 |
| Signal ground | SG | 7 | 5 |
| Data Carrier Detect | DCD | 8 | 1 |
| Data Terminal Ready | DTR | 20 | 4 |
| Ring Indicator | RI | 22 | 9 |
Note that the signal names are chosen from the perspective of a terminal (DTE); for example, a modem will transmit data to the terminal on RXD and receive data from the terminal on TXD.
Furthermore, if two DTEs are to be directly connected to each other without modems or telephones (such as the terminal and the Linux box in the introduction), the cable that connects them must interchange signals since, for example, both will want to transmit on pin 2 of a 25-pin connector and receive on pin 3. The cable must connect pin 2 of one connector to pin 3 of the other and vice versa; such a cable is called a "null modem" cable.
A null modem cable should also rearrange the modem control signals
appropriately: RTS at one end should connect to CTS at the other, and
DTR at one end should connect to DSR at the other. The DCD signal is
usually connected to DSR at each end (and therefore to DTR at the
opposite end).
TXD--\/--TXD
RXD--/\--RXD
RTS--\/--RTS
CTS--/\--CTS
DSR--+---DTR
DCD--|
DTR---+--DSR
      |--DCD
The hardware in a serial port that handles serial communications is called a UART, an acronym which stands for "Universal Asynchronous Receiver Transmitter". The UART is the device that the serial device driver drives. It is responsible for transmitting and receiving data as well as implementing all of the ancillary settings including bit rate, bits per character, parity, stop bits and modem control signals.
The UART does not directly interface with the pins on the serial port. It is usually designed to operate at much lower voltages (3.3 or 5 volts) and is interfaced to the pins by line drivers and receivers (such as the DS1488 and DS1489) that convert between the RS-232C voltages and the UART voltage.
Important examples of UARTs include the Intel 8251, a very early UART that was used in Digital Equipment Corporation's VT100 terminals, and National Semiconductor's 16550A which is probably the most popular UART on the market today.
So far, the discussion has not touched at all on the software needed to support an asynchronous serial port. In this chapter, the bottom up approach continues with a detailed look at the operating system components that provide user level processes access to the hardware. Two characters play prominent roles in this part of the story: the serial device driver and the GNU C Library.
The serial device driver is a component of the operating system kernel (in Linux, it is frequently implemented as a loadable module), and implements the hardware-specific aspects of the software needed to use the serial port. For example, there are a variety of multiport serial cards available on the market, and no two will have the same register map unless they share a chipset. Therefore each will have a different device driver, all of which present the same interface to the operating system but perform different operations on the hardware.
A large part of any device driver is usually devoted to implementing "methods" that do the real work of system calls. System calls are the way the kernel provides services to user space programs in a safe and controlled way. When a system call is executed by a user space program, it invokes a "trap" causing the CPU to context switch into the privileged kernel mode and thereby gain access to registers, I/O ports and protected memory regions that are normally off-limits to user space programs. The kernel then passes (and possibly rearranges) the system call parameters to the device driver method that was registered for that call when the driver was loaded.3
This operation is fundamentally different from the execution of a library function. Execution of a library function implies neither a context switch nor a change in CPU privilege; it is in almost every way the same as executing a normal call to a function.4 However, most of the interesting functions in the GNU C Library are implemented by executing system calls.
The Unix approach is to make an analogy between a hardware device such as a serial port and a file. A file is something that you can open, read, write and close with some expectation that the next time you open and read it you will find the last thing that was written there. Obviously this expectation cannot be extended to serial ports, but it still seems reasonable to read and write to them. Data read by a process from a serial port is what was received, and data written by a process to a serial port is transmitted.
Opening and closing a serial port is a bit more problematic. Since a
serial port isn't really a file, it may seem a bit counterintuitive to
open an entry in the filesystem in order to gain access to it.
However, this is precisely what is done for hardware devices, except
that the corresponding "special files" are conventionally found in
the directory /dev.
Special files are created by the mknod(1) command and come
in two flavors: block and character. Block devices can only be read
from or written to one block (for example, 1024 bytes) at a time and
are usually associated with disk drives, whereas character devices
operate one character at a time. Since an asynchronous serial port
always transmits or receives one character at a time, serial ports are
always character devices.
Special files also have major and minor numbers. The major number identifies to the operating system which device driver will service system calls on behalf of the special file,5 and the minor number tells the device driver which of the possibly several hardware devices it manages is being addressed. For example, a multiport serial card would be managed by a single device driver but provides several serial ports. Each serial port would have a corresponding special file with the same major number but a unique minor number. The minor number enables the device driver to distinguish which serial port has been opened, read from, written to, etc.
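A program can look up these numbers itself with stat(2). The sketch below assumes Linux with the GNU C Library, where the `major()` and `minor()` macros are provided by `<sys/sysmacros.h>`; the function name `char_device_numbers` is made up for this example.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/sysmacros.h>   /* major(), minor() on Linux with glibc */

/* Look up the major and minor numbers of a character special file.
 * Returns 0 on success, or -1 if the path cannot be examined or does
 * not name a character device. */
int char_device_numbers(const char *path, unsigned int *maj, unsigned int *min)
{
    struct stat st;

    if (stat(path, &st) < 0)
        return -1;
    if (!S_ISCHR(st.st_mode))
        return -1;

    /* For special files, st_rdev holds the device numbers. */
    *maj = major(st.st_rdev);
    *min = minor(st.st_rdev);
    return 0;
}
```

Called on the pseudo-terminal from the earlier listing, `/dev/pts/1`, it would report the same pair that `ls -l` printed there (136, 1).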
In fact, most serial ports are associated with two special
files with different major but the same minor numbers. For example,
both /dev/ttyS0 and /dev/cua0 are associated with the
first UART serial port. Since device special files are a mechanism
for extending the device/file analogy to the open(2) and
close(2) system calls, different special files for the same
device must correspond to different behaviors on one or both of these
system calls. In fact, it is the open(2) system call that
behaves differently depending on which of the two special files is
opened.
Recall that the DCD modem control signal (see Modem Control Signals) is an indication to a DTE from a DCE that a dialed call has
been completed and the opposite modem's carrier signal detected. If a
program opens a serial port expecting to receive a call, it
makes sense for the operating system to put that process to sleep
until the modem asserts DCD since there is nothing for it to do before
a call is established. On the other hand, if a program opens a serial
port for purposes of placing a call, then it must have access
to the modem before DCD is asserted in order to direct it to dial the
number. This is precisely the difference between /dev/ttyS0
and /dev/cua0: a program that opens /dev/ttyS0 will
block in the system call until DCD is asserted whereas it will not
block if it opens /dev/cua0. For this reason, /dev/cua0
is often referred to as the "callout device."
Note that the blocking behavior of /dev/ttyS0 can be modified
by setting the CLOCAL control mode flag, which causes the modem status
lines to be ignored. This flag indicates that the terminal associated
with the serial port is connected "locally", as opposed to remotely
via a modem.
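One way a program might arrange this for a directly connected terminal is sketched below. It goes slightly beyond what the text describes: it relies on the common practice of passing `O_NONBLOCK` to open(2) so that the call returns without waiting for DCD, then sets CLOCAL and switches the descriptor back to blocking I/O. The function name `open_local_serial` is hypothetical, and `O_NOCTTY` is added on the assumption that the caller does not want the port to become its controlling terminal.

```c
#include <fcntl.h>
#include <termios.h>
#include <unistd.h>

/* Open a serial port without waiting for DCD and mark it as locally
 * connected. Returns a file descriptor, or -1 on failure. */
int open_local_serial(const char *path)
{
    struct termios tios;
    int flags;
    /* O_NONBLOCK keeps open() from sleeping until the modem asserts DCD. */
    int fd = open(path, O_RDWR | O_NOCTTY | O_NONBLOCK);

    if (fd < 0)
        return -1;

    /* Set CLOCAL so that the modem status lines are ignored from now on. */
    if (tcgetattr(fd, &tios) < 0)
        goto fail;
    tios.c_cflag |= CLOCAL;
    if (tcsetattr(fd, TCSANOW, &tios) < 0)
        goto fail;

    /* Return the descriptor to ordinary blocking reads and writes. */
    flags = fcntl(fd, F_GETFL);
    if (flags < 0 || fcntl(fd, F_SETFL, flags & ~O_NONBLOCK) < 0)
        goto fail;
    return fd;

fail:
    close(fd);
    return -1;
}
```

If the path does not refer to a terminal device, tcgetattr fails and the function returns -1 after closing the descriptor.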
ioctl and termios

Special files are the method used to extend the analogy between files
and devices to the open(2) and close(2) system calls;
however, it is not at all clear how to express operations such as
changing the bit rate of a serial port or enabling hardware flow
control in terms of operations that can also be done on a file. In
fact, an additional system call, ioctl(2), exists for the
purpose of providing a mechanism for implementing these sorts of
ancillary "I/O Control" functions.
Because of the central role played by serial ports in user
interaction, the serial device driver provides a much richer set of
ioctl(2) settings than might be expected for a device as simple
as the UART. In fact, many of these settings control things like
character substitution and line buffering that have nothing to do with
the UART per se.6 Nonetheless, a
subset of the serial ioctl(2)s have a one-to-one correspondence
with UART settings like bit rate, character size, parity, modem
control signals, etc.
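As an illustration of an ioctl(2) that maps directly onto the hardware, the modem control signals can be read and changed on Linux with the `TIOCMGET`, `TIOCMBIS` and `TIOCMBIC` requests. The sketch below assumes Linux; the helper names `carrier_detected` and `set_dtr` are invented for the example.

```c
#include <sys/ioctl.h>

/* Read the modem status lines of an open serial port.
 * Returns 1 if DCD is asserted, 0 if it is not, and -1 if the ioctl
 * fails (for example because fd does not refer to a serial port). */
int carrier_detected(int fd)
{
    int status;

    if (ioctl(fd, TIOCMGET, &status) < 0)
        return -1;
    return (status & TIOCM_CAR) ? 1 : 0;
}

/* Assert (on != 0) or drop (on == 0) the DTR signal.
 * Returns 0 on success, -1 on failure. */
int set_dtr(int fd, int on)
{
    int bits = TIOCM_DTR;

    /* TIOCMBIS sets the given bits, TIOCMBIC clears them. */
    return ioctl(fd, on ? TIOCMBIS : TIOCMBIC, &bits) < 0 ? -1 : 0;
}
```

A dial-out program could, for instance, call `set_dtr(fd, 1)` before dialing and poll `carrier_detected(fd)` to find out when the call has been established.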
These days it is rare and considered bad form (for portability
reasons) to use the ioctl(2) system call on a serial port
directly. Instead, the GNU C Library provides a set of mediating
functions defined by the POSIX standard and declared in the header
file <termios.h>. Full details on these functions can be
found in
The GNU C Library Reference Manual.
The <termios.h> functions are passed a data structure called
struct termios that contains the following members
tcflag_t c_iflag; /* input modes */
tcflag_t c_oflag; /* output modes */
tcflag_t c_cflag; /* control modes */
tcflag_t c_lflag; /* local modes */
cc_t c_cc[NCCS]; /* control chars */
The first four members are bit masks in which programs can set and clear flags to specify certain behaviors. The UART settings can be directly mapped onto flags in the third bit mask, the control modes.
termios Control Modes

As an example of how to use <termios.h> functions to configure
UART settings, let us see how the control modes settings are used to
configure a serial port to operate at 9600 bps, with 8-bit characters,
no parity and one stop bit. (This is a very common serial port
setting, usually abbreviated to something like "9600 8n1".)
Usually, 9600 bps is slow enough that flow control is not needed,
either.
Because of the plethora of terminal settings available, the
recommended method for changing them is to first read the current
settings into a struct termios and then set and clear bits
to make the necessary changes. That way, the settings that we haven't
touched will remain at reasonable default values (hopefully). The
following code reads the current settings for the serial port
/dev/ttyS0 into a struct termios called tios.
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <termios.h>
#include <unistd.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    struct termios tios;
    int fd;

    /* Open the serial port for reading and writing. */
    fd = open("/dev/ttyS0", O_RDWR);
    if (fd < 0) {
        perror("open failed");   /* perror appends ": " and the error text itself */
        return 1;
    }

    /* Read the current terminal settings into tios. */
    if (tcgetattr(fd, &tios) < 0) {
        perror("tcgetattr failed");
        return 2;
    }
From here, we go about changing the settings one at a time until we arrive at "9600 8n1".
Unlike the other settings shown below, the bit rate is not set by
directly setting and clearing flags in tios.c_cflag. Instead,
a function cfsetspeed is provided to diddle the bits for
us.7
cfsetspeed(&tios, B9600);
Note that we use the preprocessor symbol B9600, not the
number 9600.
There are four possible settings for the number of bits per character:
5, 6, 7 and 8 (see Bits and Characters). The GNU C Library
defines five macros for accessing the corresponding bits in the
control modes bitmask, namely CS5, CS6, CS7,
CS8 and CSIZE. To set the number of bits per character
you first clear all of the bits with the CSIZE macro and then
set the bits corresponding to the size you want. To set eight bits
per character
tios.c_cflag &= ~CSIZE;
tios.c_cflag |= CS8;
There are three possible settings for parity: even, odd and none
(see Bits and Characters). The GNU C Library defines two macros
for accessing the corresponding bits in the control modes bitmask,
namely PARENB and PARODD. The former enables or
disables parity, the latter chooses odd parity (if parity is enabled
and PARODD is cleared, then even parity is generated). Note
that these flags affect both transmitter and receiver; for example, if
both flags are set then odd parity will be generated by the
transmitter, and if a received character (including its parity bit)
contains an even number of one bits the receiver will generate a
parity error.
To configure for no parity
tios.c_cflag &= ~(PARENB | PARODD);
In the GNU C Library, there are two possible settings for the number
of stop bits: one and two (see Bits and Characters). If the
CSTOPB flag is set, two stop bits are used, and only one is
used if it is cleared. The following example shows how to configure a
serial port for one stop bit.
tios.c_cflag &= ~CSTOPB;
The flow control signals are the RTS and CTS modem control signals
that can be used to gate the flow of data in and out of a serial port
(see Modem Control Signals). Flow control can be enabled and
disabled by setting and clearing the CRTSCTS flag.8 The following example shows how to
configure a serial port for no flow control.
tios.c_cflag &= ~CRTSCTS;
Finally, we can apply our changes using the tcsetattr function
tcsetattr(fd, TCSANOW, &tios);
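Note that this function returns 0, indicating success, if any one of the requested changes could be carried out, not only if all of them were. A careful program therefore calls tcgetattr again afterwards and checks that the settings it asked for are actually in effect.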
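The individual steps can be gathered into a pair of helper functions. This is a sketch rather than a definitive implementation: the names `set_9600_8n1` and `apply_9600_8n1` are hypothetical, the POSIX functions cfsetispeed and cfsetospeed are used in place of cfsetspeed, and the read-back check at the end is there because tcsetattr reports success when any of the requested changes could be carried out.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE      /* exposes CRTSCTS under strict compiler modes */
#endif
#include <termios.h>

/* Modify *tios for 9600 bps, 8 data bits, no parity, one stop bit and no
 * hardware flow control, leaving all other settings untouched.
 * Returns 0 on success, -1 if the speed could not be stored. */
int set_9600_8n1(struct termios *tios)
{
    if (cfsetispeed(tios, B9600) < 0 || cfsetospeed(tios, B9600) < 0)
        return -1;
    tios->c_cflag &= ~CSIZE;              /* clear the character size bits */
    tios->c_cflag |= CS8;                 /* 8 bits per character */
    tios->c_cflag &= ~(PARENB | PARODD);  /* no parity */
    tios->c_cflag &= ~CSTOPB;             /* one stop bit */
    tios->c_cflag &= ~CRTSCTS;            /* no RTS/CTS flow control */
    return 0;
}

/* Read the current settings, modify them, apply them, then read them
 * back and compare the bits this example cares about.
 * Returns 0 if the port ended up at 9600 8n1, -1 otherwise. */
int apply_9600_8n1(int fd)
{
    const tcflag_t mask = CSIZE | PARENB | PARODD | CSTOPB | CRTSCTS;
    struct termios want, got;

    if (tcgetattr(fd, &want) < 0 || set_9600_8n1(&want) < 0)
        return -1;
    if (tcsetattr(fd, TCSANOW, &want) < 0 || tcgetattr(fd, &got) < 0)
        return -1;
    if ((got.c_cflag & mask) != (want.c_cflag & mask) ||
        cfgetispeed(&got) != B9600 || cfgetospeed(&got) != B9600)
        return -1;
    return 0;
}
```

Keeping the flag manipulation in a function that only touches a struct termios makes it possible to exercise it without a serial port attached.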
termios Input/Output Modes

The control mode settings are the only ones that directly affect the UART hardware, but there are numerous other settings that control the way the serial device driver and kernel TTY layers will process I/O on the device. At the lowest software level, these settings control how parity errors and breaks are handled and how character substitution is performed.
If the hardware is configured to generate parity bits on transmitted
characters and to check parity on received characters (i.e., the
PARENB flag is set in the control modes, see ioctl and termios), the software must still decide what to do when parity
errors are reported on received characters. There are four choices
for how to handle characters containing parity errors: pass it through
exactly as received (ignore the error), drop it (ignore the
character), mark it as containing an error, or replace it with
something else. These behaviors are controlled by flags in the
terminal input modes, c_iflag, namely INPCK,
IGNPAR and PARMRK.
tios.c_iflag &= ~INPCK;
The UART will still generate parity bits on transmitted characters as long as the
PARENB bit in the control modes is set, but input
characters with parity errors will be passed on as they were received
by the device driver.
tios.c_iflag |= (INPCK | IGNPAR);
tios.c_iflag &= ~PARMRK;
Any received characters containing parity errors will be silently
dropped.
tios.c_iflag |= (INPCK | PARMRK);
tios.c_iflag &= ~IGNPAR;
This will cause a character containing a parity or framing error to be
replaced by a three-character sequence consisting of the erroneous
character preceded by \377 and \000. A legitimate
received \377 will be replaced by a pair of \377s.
tios.c_iflag |= INPCK;
tios.c_iflag &= ~(IGNPAR | PARMRK);
This will cause a character containing a parity or framing error to be
replaced by the null character \000.
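The four behaviors can be collected behind a single helper, shown here as a sketch. The enum and function names are hypothetical. Unlike the first snippet above, this version also clears IGNPAR and PARMRK in the pass-through case, which is a choice made for the example so that each mode starts from the same state.

```c
#include <termios.h>

/* The four ways of handling received characters with parity errors. */
enum parity_error_mode {
    PARITY_ERR_PASS,   /* pass characters through as received */
    PARITY_ERR_DROP,   /* silently discard bad characters */
    PARITY_ERR_MARK,   /* prefix bad characters with \377 \000 */
    PARITY_ERR_NUL     /* replace bad characters with \000 */
};

/* Set the INPCK, IGNPAR and PARMRK input mode flags in *tios for the
 * requested behavior, leaving all other input flags untouched. */
void set_parity_error_mode(struct termios *tios, enum parity_error_mode mode)
{
    tios->c_iflag &= ~(INPCK | IGNPAR | PARMRK);   /* start from a clean slate */

    switch (mode) {
    case PARITY_ERR_PASS:
        break;
    case PARITY_ERR_DROP:
        tios->c_iflag |= INPCK | IGNPAR;
        break;
    case PARITY_ERR_MARK:
        tios->c_iflag |= INPCK | PARMRK;
        break;
    case PARITY_ERR_NUL:
        tios->c_iflag |= INPCK;
        break;
    }
}
```

As with the control modes, the modified structure only takes effect once it is passed to tcsetattr.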
The character substitutions provided by the low-level input/output
modes are there in part to deal with the endless confusion over how
lines of text should be terminated. Although one might think that
ASCII text is a well-defined standard file format, it leaves
unspecified what character or character-sequence should mark the end
of a line. There are three commonly used conventions composed of two
characters: carriage return (ASCII octal code \015,
<Ctrl>-M, '\r') and line feed (ASCII octal code
\012, <Ctrl>-J, '\n'). Macintosh computers
use a single carriage return, VMS, DOS and Windows computers use a
carriage return followed by a line feed, and Unix computers use a
single line feed.
One could think of an ASCII text file as being a transcript of a
terminal session. In particular, a command like
$ cat file.txt >/dev/ttyS0
should produce readable output on the terminal attached to
/dev/ttyS0 and a command like
$ cat /dev/ttyS0 >file.txt
should capture all of the input from the terminal in the file
file.txt. Typically, real character-cell terminals will treat
received carriage return and line feed characters literally: a
carriage return moves the cursor to the leftmost column without
altering its row position and a line feed advances the cursor to the
next row without altering its column position. Therefore, in order
for the first command above to produce the expected output on the
terminal screen, the lines in file.txt should be terminated by
both a carriage return and a line feed (the DOS convention). However,
real terminals typically do not transmit more than one ASCII character
per key typed, and the <Enter> or <Return> key will usually
cause it to transmit a single carriage return. Therefore, the second
command above will create a file.txt with lines terminated by a
single carriage return (the Macintosh convention). It could be said
that terminal output follows the DOS convention and terminal input
follows the Macintosh convention.
There are four flags available in the struct termios low-level
input and output modes c_iflag and c_oflag for coping
with this mess, namely IGNCR, ICRNL, INLCR and
ONLCR.
IGNCR: ignore carriage return. If this flag is set,
carriage returns are discarded on input. This is only useful if the
attached terminal transmits both carriage return and line feed
characters when the <Return> key is pressed (rare).
ICRNL: replace input carriage returns with line feeds.
This is useful if the attached terminal transmits a single carriage
return character when the <Return> key is pressed (the most
common case). The lines received by the process reading from the
serial port will be terminated by a single line feed character
('\n'), which is again the Unix convention.
INLCR: replace input line feeds with carriage returns.
This is the inverse of the ICRNL flag, and rarely used.
ONLCR: replace output line feeds with carriage return
and line feed. This will cause a line of text terminated by a single
line feed character to display properly on a terminal that moves its
cursor as described above.
Therefore, the most common character substitution configuration is to
set the flags ICRNL and ONLCR and leave the rest
cleared.
tios.c_iflag &= ~(IGNCR | INLCR);
tios.c_iflag |= ICRNL;
tios.c_oflag |= (OPOST | ONLCR);   /* ONLCR only takes effect when OPOST is set */
This way, output lines of text terminated according to the Unix
convention (a single '\n') will be transmitted to the terminal
as lines terminated by a carriage return and line feed ('\r'
followed by '\n') and will display properly. Furthermore,
input lines of text terminated by a single carriage return
('\r') will be read by a process as lines terminated by a
single line feed ('\n'), which is the Unix convention.
Note that there is no combination of these flags that will just
replace output line feeds with carriage returns (without a following
line feed). You might expect that ONLCR is exactly
complementary to ICRNL, but the latter replaces a single
character ('\r') with a single character ('\n') on
input, whereas the former replaces a single character ('\n')
with two characters ('\r' and '\n') on output.
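The asymmetry is easy to see if the two substitutions are imitated in user space. The functions below are only an illustration of the effect of the flags on a buffer of characters; they are not the kernel's code, and their names are made up.

```c
#include <stddef.h>

/* Imitate ONLCR: copy n bytes from in to out, expanding each '\n' into
 * the two characters '\r' '\n'. out must have room for 2 * n bytes.
 * Returns the number of bytes written, which may exceed n. */
size_t emulate_onlcr(const char *in, size_t n, char *out)
{
    size_t j = 0;

    for (size_t i = 0; i < n; i++) {
        if (in[i] == '\n')
            out[j++] = '\r';
        out[j++] = in[i];
    }
    return j;
}

/* Imitate ICRNL: replace each '\r' with '\n' in place. The length of
 * the data never changes. */
void emulate_icrnl(char *buf, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (buf[i] == '\r')
            buf[i] = '\n';
}
```

The output substitution can lengthen the data, which is why it needs a separate destination buffer, whereas the input substitution is a one-for-one replacement.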
The ASCII character set contains only 128 distinct characters: it fits entirely into seven bits. Nonetheless, when reading or writing data to a terminal device, characters are always padded so that every character occupies exactly one byte, regardless of the configured number of bits per character (see Bits and Characters). These settings only affect the number of bits that are put on the wire by the UART; the operating system must still provide the UART with one byte per character transmitted and the user program with one byte per character received.
Since the ASCII character set fits in seven bits, but a program
reading from a terminal device reads one byte (eight bits) per
character, it is sometimes useful to be able to strip the eighth, most
significant bit off of input characters (in practice, this means
setting it to zero). The input mode flag ISTRIP is provided
for precisely this purpose:
tios.c_iflag |= ISTRIP;
The most likely scenario where this flag is useful is if the
communication channel is configured for parity and seven bits per
character. In this case, the eighth bit on every received character is
a parity bit, not part of the data payload. The user program does not
need to know the value of the parity bit to check for parity errors;
the device driver will check parity automatically if the PARENB
control mode flag and the INPCK input mode flag are set
(see termios Input and Output Modes).
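As a minimal sketch of the scenario just described, the following helper prepares a termios structure for seven data bits with even parity and strips the eighth bit from input. The function name configure_7e1 is invented for this example. The choice of even parity and one stop bit is an assumption, since the text only says "parity and seven bits per character".

```c
#include <termios.h>

/* Prepare tios for seven data bits, even parity, one stop bit.
   Only the structure is modified; the caller still has to pass
   it to tcsetattr() for the settings to take effect. */
void configure_7e1(struct termios *tios)
{
    /* Clear the character size field, select even parity
       (PARODD cleared) and one stop bit (CSTOPB cleared). */
    tios->c_cflag &= ~(CSIZE | PARODD | CSTOPB);
    tios->c_cflag |= CS7 | PARENB;

    /* Have the driver check parity on input and zero the
       eighth bit of every character handed to the process. */
    tios->c_iflag |= INPCK | ISTRIP;
}
```

A program would typically call tcgetattr(fd, &tios), then configure_7e1(&tios), then tcsetattr(fd, TCSANOW, &tios), where fd is an open serial port.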
The single out of band signal supported by most UARTs is the break
(see Break). Transmission of a break is meant to signal an
exceptional condition, and a program reading from a serial port can
choose how it wants to deal with these sorts of exceptions. Two input
mode flags are provided to configure how breaks are handled, namely
IGNBRK and BRKINT. If IGNBRK is set, then break
conditions are ignored. If IGNBRK is cleared and BRKINT
is set then a SIGINT signal will be sent to the foreground
process group associated with the terminal (see Sessions groups processes).
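If both IGNBRK and BRKINT are cleared, a break is read by the process
as a null byte ('\0'), or as the three-byte sequence '\377', '\0',
'\0' if the PARMRK input mode flag is set.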
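The choice between these break policies can be wrapped in one small helper. This is only a sketch: the enum break_policy and the function set_break_policy are names invented for this example.

```c
#include <termios.h>

/* Hypothetical names for the ways a program can treat a break. */
enum break_policy {
    BREAK_IGNORE,   /* IGNBRK set: breaks are discarded */
    BREAK_SIGINT,   /* IGNBRK clear, BRKINT set: SIGINT is sent */
    BREAK_AS_DATA   /* both clear: the break is delivered as input */
};

/* Adjust only the two break-related input flags in tios. */
void set_break_policy(struct termios *tios, enum break_policy policy)
{
    tios->c_iflag &= ~(IGNBRK | BRKINT);

    if (policy == BREAK_IGNORE)
        tios->c_iflag |= IGNBRK;
    else if (policy == BREAK_SIGINT)
        tios->c_iflag |= BRKINT;
    /* BREAK_AS_DATA: both flags stay cleared, so the process
       reads a '\0' byte (or '\377' '\0' '\0' if PARMRK is set). */
}
```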
Two ASCII characters can be set aside for purposes of flow control.
Traditionally, these are <Ctrl>-s (ASCII DC3, '\023')
and <Ctrl>-q (ASCII DC1, '\021'), but the actual
values at any time can be accessed by reading or writing the values of
tios.c_cc[VSTOP] and tios.c_cc[VSTART], respectively.
If the IXON input mode flag is set, then the device driver will
suspend output to the terminal device (by putting the process to sleep
the next time it tries to write to the device) when it receives the
STOP character on input. Output is resumed (by waking
up the process) when the device driver receives the START character on
input. Since the START/STOP characters control the flow of output,
this is called "output flow control".
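Output flow control with the traditional characters could be requested as in the sketch below. The function name enable_output_flow_control is invented for this example, and setting the START and STOP characters explicitly is optional, since they normally already hold these values.

```c
#include <termios.h>

/* Enable START/STOP output flow control in tios, using the
   traditional control characters. The caller applies the
   structure with tcsetattr(). */
void enable_output_flow_control(struct termios *tios)
{
    tios->c_iflag |= IXON;
    tios->c_cc[VSTART] = 021;  /* <Ctrl>-q, ASCII DC1 */
    tios->c_cc[VSTOP]  = 023;  /* <Ctrl>-s, ASCII DC3 */
}
```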
Conversely, if the IXOFF input mode flag is set, then the
device driver will output (transmit) the STOP character if the input
buffer is filling faster than programs are reading it, and output the
START character once enough space becomes available. In this case, it
is the responsibility of the hardware connected to the serial port
(e.g. a terminal) to stop transmitting data when it receives the STOP
character until it receives a START. Here, then, the START/STOP
characters control the flow of input, and therefore this is called
"input flow control".
termios Local Modes
The high-level input processing settings available in the
tios.c_lflag mask mostly control the way that
characters are echoed. In this context, echo means that the device
driver will transmit every character received back to the terminal
connected to the serial port. This is the preferred setting, unless
the terminal itself has a "local echo" enabled. There are some
additional wrinkles to this behavior having to do with how erase
characters are echoed that are rarely used these days since they are
intended for hardcopy terminals. However, there is one very important
flag, ICANON, that deserves a more detailed treatment.
Recall the model of a character-cell terminal connected to a serial
port providing a user with an interactive shell. The user types
command lines, and the shell executes them. The role of the operating
system kernel is to deliver the command lines to the shell and their
output to the terminal. Typically, the shell process is sleeping in a
blocking read, waiting for input from the user. The kernel will
receive an interrupt for every character typed by the user. It could
pass the characters on to the shell one at a time as they are typed,
but this is inefficient because the shell works with input one line at
a time (i.e. it doesn't do anything until you hit return), and there
is a fairly high probability that any line of input will contain at
least one correction (remember, the backspace character is also an
ASCII character, \010, and is treated like any other by the
serial port hardware). So if the device driver itself could buffer
the input into lines, which implies that it must also handle
corrections, then the shell process can be left sleeping until the
input is really ready for it to do something with it. This greatly
reduces the amount of context-switching.
This is precisely the meaning of "canonical input processing": if
the serial port is configured for canonical input processing (i.e. the
ICANON bit is set in the local modes, tios.c_lflag) then
the device driver will buffer input into lines, and a process reading
from the serial port will block until a complete line is available,
that is, until a line feed is received (or a carriage return, which is
translated to a line feed when ICRNL is set). Since this implies that the device
driver will also have to handle the correction characters (backspace,
word erase, line erase, etc.), there are a number of additional
settings available to control how this will be done, and in particular
how erasures will be echoed to the terminal: ECHO,
ECHOE, ECHOPRT, ECHOK, ECHOKE and
ECHONL. The usual default settings for these flags (with
canonical input processing) are
tios.c_lflag |= (ICANON | ECHO | ECHOE | ECHOK | ECHOKE);
tios.c_lflag &= ~(ECHOPRT | ECHONL);
In practice, these flags should only deviate from the defaults if
the device connected to the serial port is not a terminal, or if it is
a hardcopy (printing) terminal. For complete details, see
The GNU C Library Reference Manual.
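One common reason to deviate from the defaults even on an ordinary terminal is reading a password: echo is turned off while canonical line editing stays on. The helper below is a sketch of that idea. The name set_echo is invented for this example, and using ECHONL so that the final newline is still echoed is a design choice, not a requirement.

```c
#include <termios.h>

/* Turn character echo on or off in tios without touching ICANON.
   With echo off, ECHONL is set so that the newline ending the
   line is still echoed and the cursor moves to the next line. */
void set_echo(struct termios *tios, int on)
{
    if (on) {
        tios->c_lflag |= ECHO;
        tios->c_lflag &= ~ECHONL;
    } else {
        tios->c_lflag &= ~ECHO;
        tios->c_lflag |= ECHONL;
    }
}
```

A program would save the original settings obtained from tcgetattr(), apply the modified copy with tcsetattr(), and restore the saved settings once the password has been read.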
Since an interactive shell is not the only process that might read
from a serial port, there has to be some provision for running a
serial port without canonical input processing. The important
question is one of buffering input. In canonical mode, input is
buffered one line at a time, in non-canonical modes the amount of
buffering is determined by two parameters in the termios
structure: tios.c_cc[VMIN] and tios.c_cc[VTIME]. The
notion is that VMIN specifies the minimum number of characters
to buffer and VTIME the maximum amount of inter-character time
(in tenths of a second) before the device driver returns the buffered
input to a process reading from the serial port. The precise
interactions between the two parameters are subtle; once again, full
details are found in
The GNU C Library Reference Manual.
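A minimal sketch of switching to non-canonical input follows. The function name set_noncanonical is invented for this example. The comment summarizes the four combinations of the two parameters as POSIX specifies them.

```c
#include <termios.h>

/* Disable canonical input processing in tios and set the two
   buffering parameters. vtime is in tenths of a second.

   vmin > 0, vtime > 0: read returns once vmin bytes have arrived,
                        or when the gap after a received byte
                        exceeds vtime
   vmin > 0, vtime = 0: read blocks until vmin bytes have arrived
   vmin = 0, vtime > 0: read returns as soon as a byte arrives,
                        or with 0 bytes after vtime has elapsed
   vmin = 0, vtime = 0: read returns immediately with whatever
                        is already buffered */
void set_noncanonical(struct termios *tios, cc_t vmin, cc_t vtime)
{
    tios->c_lflag &= ~ICANON;
    tios->c_cc[VMIN] = vmin;
    tios->c_cc[VTIME] = vtime;
}
```

For example, set_noncanonical(&tios, 1, 0) delivers each character to the reading process as soon as it arrives, which matches the "min = 1; time = 0" values that stty reports.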
stty
The stty(1) command provides access to the termios
settings from the command shell. The easy way to find the current
termios settings for the controlling terminal of a running login shell
is to type the command stty -a, which produces output
similar to the following:
$ stty -a
speed 38400 baud; rows 24; columns 80; line = 0;
intr = ^C; quit = ^\; erase = ^?; kill = ^U; eof = ^D; eol = <undef>;
eol2 = <undef>; start = ^Q; stop = ^S; susp = ^Z; rprnt = ^R; werase = ^W;
lnext = ^V; flush = ^O; min = 1; time = 0;
-parenb -parodd cs8 -hupcl -cstopb cread -clocal -crtscts
-ignbrk -brkint -ignpar -parmrk -inpck -istrip -inlcr -igncr icrnl ixon -ixoff
-iuclc -ixany -imaxbel
opost -olcuc -ocrnl onlcr -onocr -onlret -ofill -ofdel nl0 cr0 tab0 bs0 vt0 ff0
isig icanon iexten echo echoe echok -echonl -noflsh -xcase -tostop -echoprt
echoctl echoke
For the most part, all of the settings shown above exist in one-to-one
correspondence to termios settings. The speed setting is obvious.
Settings prefixed with a - in the list are negated.
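The correspondence between a termios bit and its stty spelling can be sketched with a small formatting helper. The function format_flag is invented for this example and is not how stty itself is implemented. It only shows the convention that a cleared flag is printed with a leading minus sign.

```c
#include <stdio.h>
#include <termios.h>

/* Write name into buf if bit is set in flags, or "-name" if it
   is cleared, following the convention of stty -a.
   Returns the length of the resulting string. */
int format_flag(char *buf, size_t size, const char *name,
                tcflag_t flags, tcflag_t bit)
{
    return snprintf(buf, size, "%s%s", (flags & bit) ? "" : "-", name);
}
```

Called with tios.c_iflag and the bit ICRNL, it produces "icrnl" when the flag is set and "-icrnl" when it is cleared.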
Getting the modem control signals.
#include <stdio.h>
#include <sys/ioctl.h>

int flags;
/* fd is an open serial port; TIOCMGET stores the line states in flags */
if (ioctl(fd, TIOCMGET, &flags) == 0) {
if (flags & TIOCM_RTS) puts(" RTS");
if (flags & TIOCM_CTS) puts(" CTS");
if (flags & TIOCM_DSR) puts(" DSR");
if (flags & TIOCM_DTR) puts(" DTR");
if (flags & TIOCM_CAR) puts(" DCD");
if (flags & TIOCM_RI) puts(" RI");
}
TIOCMSET
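Setting a modem control signal follows the same pattern in reverse: read the current bits, change the one of interest, and write them back with TIOCMSET. The sketch below does this for DTR. The function name set_dtr is invented for this example, and fd is assumed to be an open serial port.

```c
#include <sys/ioctl.h>

/* Raise (on != 0) or drop (on == 0) the DTR line of the serial
   port open on fd. Returns 0 on success, or -1 if either ioctl
   fails, for instance because fd is not a serial device. */
int set_dtr(int fd, int on)
{
    int bits;

    if (ioctl(fd, TIOCMGET, &bits) == -1)
        return -1;

    if (on)
        bits |= TIOCM_DTR;
    else
        bits &= ~TIOCM_DTR;

    return ioctl(fd, TIOCMSET, &bits);
}
```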
SIGTTIN SIGTTOU SIGWINCH SIGTSTP
Console cannot be a controlling tty.
openvt(1), doshell(8) and switchto(1) commands
DEC VT100, Tektronix 4014, IBM 3270
ioctl and termios
There are devices that are called terminals but fail to meet either of these criteria. Notable examples are the venerable IBM 3270 series of terminals which connect to an SNA network (usually a 3174 establishment controller) via coaxial connectors and speak EBCDIC. These differences reflect the very different philosophy of mainframe computing as opposed to the minicomputer environment of Unix and VMS. When minicomputers were introduced by Digital Equipment Corporation, the concept of a user terminal generating an interrupt on the main CPU for every character typed was revolutionary; contemporaneously, IBM was using specialized equipment (establishment controllers, communications controllers, front-end processors, etc.) to offload communications functions from the mainframe. IBM's terminals operated one page at a time; DEC's terminals operated one character at a time. This could be the subject of a very interesting article, but it is beyond the scope of this one, so let us stick to our original working definition of a terminal and leave it to the pedants to figure out where these other beasts fit in the terminal taxonomy.
There are two more possible parity settings, rarely seen, called "mark" and "space" parity. If one of these settings is chosen, the transmitted parity bit is always a mark or a space, respectively.
The Linux system calls are declared in the file
/usr/include/linux/syscall.h. The assembly traps are found in
/usr/src/linux/arch/i386/entry.S.
The only difference that comes to mind is that most libraries are dynamically loaded and the code is shared by all processes using the library simultaneously.
Since major
numbers identify device drivers, there has to be a convention
specifying which number corresponds to each device driver. This can
be found in the file /usr/src/linux/Documentation/devices.txt.
It could reasonably be argued that the
purpose of pseudo-terminals, virtual terminals that do not have
physical UARTs and do not modify their behavior in response to
ioctl(2)s changing UART settings, is to provide these non-UART
settings and behaviors to interactive programs like the shell in
situations when there is no real UART involved such as a network login
or X Windows session. See Pseudo-terminals.
In fact, input and output speeds can be set independently
to unequal values using the cfsetispeed and cfsetospeed
functions, but this is rarely done in practice.
The
GNU C Library actually allows input (RTS) and output (CTS) flow
control to be independently controlled using the CCTS_OFLOW and
CRTS_IFLOW flags.