Basic TCP/IP Theory
In order to do sockets programming, you need to have a basic knowledge of what TCP/IP is and how it works. This page is intended to introduce you to that information.
Basically, TCP/IP is one particular collection of networking protocols, one of many. What makes TCP/IP so special is that it is the protocol suite that the Internet is based on and so support for it has become almost universal.
TCP/IP stands for "Transmission Control Protocol/Internet Protocol":
- The TCP portion deals with the manner in which computers connect and communicate with each other.
- The IP portion deals with using the address of the destination computer to find a connection to that computer's local network regardless of where in the world it is on the Internet.
The unit of transfer in TCP/IP is the packet. A packet is an individual message, which is sometimes compared to a snail-mail letter. Each packet has a header which contains a source address, a destination address, and additional information needed by the system. It also has a data block, which is most often the actual message being sent. The format of the packet's header is well-defined, whereas the format of the data block depends entirely upon the application-level protocol that is using it.
Within the framework of TCP/IP, the TCP portion deals with the creation and handling of the packets themselves while the IP portion deals with getting those packets to their destination.
Sockets programming deals with the creation and handling of these packets. You normally never need to deal with the actual format of the packet; the data in the packet is the only thing whose format you have any control over. In general, you provide the destination address, the data block, and which protocol (tcp or udp -- see below) and the system provides the rest.
|Application Layer|
|Transport Layer|
|IP Layer|
|Data Link Layer|
Sorry to do this to you, but just about every book on the subject presents either this layer model or the 7-layer OSI model. Still, it is useful for keeping a few concepts straight. My treatment of it here is far from complete.
The basic function of each layer is:
- The Data Link Layer handles the physical connection between the computers and the actual transmission of data over the wire.
- The IP Layer handles IP addressing and the routing of packets from source to destination. This is where the IP portion of TCP/IP does its job.
- The Transport Layer (AKA "TCP Layer") establishes and maintains the communications session between the computers and handles the transmission and receipt of data packets. This is where the TCP portion of TCP/IP does its job. This is where most sockets programming is performed.
- The Application Layer creates the data to be sent and interprets and processes the data received. It also manages the overall session between the applications on the two computers.
This layer model helps to keep a few important concepts in mind:
- The overall networking task is split up among the different layers, with each layer performing a specific part of that task.
- The data flowing out of the source computer heads down through the layers from application to data link, and the data flowing into the destination computer heads up through the layers from data link to application.
- The connection effectively works as if the corresponding layers of the connected computers were communicating directly with each other.
- Each layer has its own protocols that govern how it communicates with its opposite number on the other computer.
- Each layer does its own job without having to know exactly how the layers above and below do their jobs.
Sockets programming operates primarily within the Transport Layer.
The Transport Layer actually offers two different protocols: tcp and udp. The difference between tcp and udp is in the type of connection that they form between the computers and the consequences that has for the transfer of data:
- tcp (transmission control protocol) is a "connection-oriented" protocol. This is often compared to a telephone conversation.
  - Data Stream of Packets: tcp packets are considered to be part of a data stream which uses multiple packets.
  - Confirmed Connection: A confirmed connection is established through a short series of synchronization requests and acknowledgements, AKA the "three-way handshake". No data gets sent until that connection has been established.
  - No Third Parties: Each connection involves exactly two endpoints. I.e., the connection could be between two different computers or between two ports on the same computer, but a third party cannot join the connection.
  - Guaranteed Delivery: The data is guaranteed to arrive at its destination intact and in order, or else the sender finds out that the connection has failed. The destination acknowledges receipt of an intact packet. If the source does not receive an acknowledgement for a particular packet, it resends that packet.
  - No Data Size Limit: The network limits the maximum size of a packet. If the data is too large to fit into one packet, it gets split up into smaller packets for transmission and sent with additional information that the destination uses to reassemble the original data.
- udp (user datagram protocol) is a "connectionless" protocol. This is often compared to sending a telegram or a snail-mail letter.
  - Single-Packet Datagrams: The datagram is a common networking concept found in many protocol suites that has been used in network programming since long before TCP/IP. A datagram is a single self-contained message that is sent between hosts. Each exchange is the transmission/receipt of a single datagram.
  - Data Size Limited to a Single Packet: All the data is sent in one single datagram. The protocol allows up to 65,507 bytes of data per datagram over IPv4, but applications usually keep datagrams much smaller (512 bytes is a traditional conservative limit) so that they travel without being fragmented. If the data is too large to fit into a single datagram, then udp cannot carry it in one piece; the application has to split it up itself or use tcp.
  - No Connection: The packet is just sent. "Send and forget." No connection is established and no confirmation of receipt is expected. udp won't even check to see if the destination exists.
  - Multiple Destinations Possible: The same packet can be sent to any number of destinations -- one at a time, of course. Or else a datagram can be sent as a broadcast or a multicast.
  - No Guarantees: There is no guarantee that the packet will arrive nor that it will arrive intact (a corrupted packet will fail checksum and be rejected by the destination). If you want confirmation or a retransmission procedure, then you will have to implement it in the application itself, since it does not happen in the protocol. Or else use tcp instead.
There are a number of factors to consider in choosing between tcp and udp, such as:
- the amount of data being sent and the size of the data blocks
- the number of recipients
- the extra overhead of tcp -- establishing the connection, acknowledging receipt, retransmitting
- how reliable the transmission must be
Many sources discuss these factors in greater detail and clarity, such as Jon Snader's Effective TCP/IP Programming: 44 Tips to Improve Your Network Programs. In general, tcp seems to be used more often than udp, but there are applications for which udp is the clear choice.
IMPORTANT NOTE: In a given sockets connection, both the source and the destination must use the same protocol. You either use tcp or you use udp; you cannot use both at the same time. It is a common mistake to try to connect to a tcp server with a udp client or vice versa. It just won't work.
Of course, if the application uses multiple sockets (e.g., FTP uses two connections: one for control messages and the other for data transfer), then some of the sockets can use tcp while the others use udp. But the rule still applies that an individual socket must be purely tcp or udp.
First, we need a few definitions:
- protocol stack -- the hardware and software needed to implement the network interface.
- NIC -- Network Interface Card. This is the actual hardware connection to the network. Every host needs at least one NIC.
- multi-homed -- a host that has more than one NIC.
- host -- any device, including computers, which has a NIC on a network.
- peer -- the other host in a network connection.
In TCP/IP, each address consists of:
- An IP address. This gets you to a particular interface on a particular computer. Every host on a network and every host on the Internet must have its own unique IP address; there must not be any duplicates. Read the next section for a description of IP Addressing.
- A protocol. In our case, that would be either tcp or udp.
- A port number. Conceptually, this is the point in the computer's interface where the actual connection is made between hosts. There are 65,536 ports for tcp and another 65,536 ports for udp, though the operating system will limit how many ports can be open at the same time.
In every sockets application, you designate which protocol you are using and which local port (though you can allow the OS to assign a port for you) -- please note that in most cases your own IP address is already known, though you may need to specify one if you are multi-homed. Then if you are making a connection with another host, you designate the IP address and port you want to connect to, or, if you are acting as a server, then you get the client's IP address and port when it connects to you.
So now we can answer that question that you must be asking: why is it called "sockets"? A socket is defined as a software construct for handling a port. You can think of the ports as the specific points at which the computers connect to each other and a "socket" as the software interface to a port. Or you could think of a "socket" as where your application plugs in to form the connection. Actually, both ports and sockets are software abstractions, since there aren't really 131,072 tiny connectors in there, even though it helps us to visualize that there are.
Each socket is uniquely identified by a pair of addresses: the local address and the peer's address. This is important to remember, because we must be careful not to associate a socket too closely with a local port. There is a cardinal rule that a given socket can only be used for one connection at a time. But if we run a server, say a web server on TCP port 80, and ten clients connect to us, then we will see that our local TCP port 80 is involved in each of those ten connections. That appears to violate that cardinal rule. However, what is really happening is that each of those is a different socket, because each of them has a different peer address. We are not using the same socket ten times over, but rather we are using ten different sockets. There is no actual conflict nor violation of the rule.
Referring back to the protocol layers, the IP layer recognizes its own IP address in the incoming packet header and passes the packet up to the Transport layer, which is where the ports are. There, the protocol and port in the destination address are matched to the peer socket that is bound to that port -- if any -- and the packet is read by the associated application. Of course, if the peer has not bound a socket to that port, then you will not be able to connect to it.
In TCP/IP, there are two sets of ports: the tcp ports and the udp ports, which are used by the tcp protocol and by the udp protocol respectively. There are 65,536 tcp ports and 65,536 udp ports, which are all divided into three ranges:
|Port Range|Used For|
|The Well Known Ports, 0 through 1023|reserved for common services like telnet, ftp, sendmail|
|The Registered Ports, 1024 through 49151|registered by companies and organizations|
|The Dynamic and/or Private Ports, 49152 through 65535|available for the rest of us to use|
This solves the problem that you should have been wondering about: how does a client know which port to connect to on the server? Obviously, he needs to know ahead of time which port is being used. Well, in the case of a well-known port or a registered port, like telnet(23) or http(80) or pcANYWHERE(5631), then he already knows because it has been determined beforehand. Otherwise, the client has to have been informed of the server's port through some other arrangement.
In sockets programming, we attach our socket to a port (called "binding the socket") and send or receive packets to or from another port somewhere else, usually on another computer. The binding of a socket to one address and the connecting of it to another address are accomplished through functions in the sockets API.
A list of the assigned port numbers is posted on IANA's site.
There are two forms of IP addressing: IP Version 4 (IPv4) and IP Version 6 (IPv6). IPv6 is the future replacement for IPv4, but IPv4 is still the most commonly used form and the one which I will cover. You can read about IPv6 elsewhere.
In the protocol layers, the IP layer is responsible for handling IP addresses and for recognizing its own IP address in the incoming packet header. In TCP/IP, every device on a network, including on the Internet, has a unique IP address. This is true of computers, network printers, and routers. Each network device has at least one network interface (e.g., a NIC) and some have more than one (e.g., a router, which by definition is connected to at least two networks). Each network interface has a unique IP address assigned to it.
In IPv4, an IP address is given as a series of four numbers separated by periods in what is called "dotted-decimal" format. Each of these numbers falls in the range of 0 to 255 and is called an "octet", because it is eight bits long. The reason that they are not called "bytes" is that for most of the history of electronic computers (about the mid-1940's to the mid-to-late 1970's), every kind of computer had its own definition of how long a byte was. On the other hand, the term "octet" specifically means 8 bits, so there is no ambiguity.
A device on a network is called a host. The first bits in an IP address are the network bits, which uniquely identify which network the host is on. The remaining bits uniquely identify the host. Exactly where the network bits end and the host bits begin depends on a number of things and is too involved to get into right now. Suffice it to say that for two hosts to talk to each other, they either have to be on the same network or there has to be a router or routers that eventually connect their networks. This is one of the "gotchas" that most beginners fall victim to.
If you need to set up your own local network and select your own IP addresses, then you should also read my basic instructions for selecting an IP address for a host. In addition, there are a lot of really good explanations on the Internet. There's no such thing as learning too much.
A mundane-sounding topic, yet this is a prime source of problems for the beginner. I know, because I've fallen for it, too.
Most data values in a computer are contained in strings of bytes. However, the order in which those strings of bytes are stored can vary from computer to computer. Terms like "big-endian" and "little-endian" are used to describe how computers organize multi-byte data. Since we are going to be exchanging binary data directly between a wide variety of computers, we need to standardize the format of the data.
Therefore, all multi-byte values used by the tcp and udp protocols must be in one specific order, called "network order", which is big-endian (most significant byte first). The order that your computer keeps these values in is called "host order". If your host order is different from network order and you fail to compensate for it, then your sockets will not work. Period. End of story. That's all she wrote.
The good news is that you do not need to know what order your local host uses. The sockets API provides a number of conversion functions, including ones that will convert host-order values into network order and vice versa. The functions for converting byte order are:
- uint16_t htons(uint16_t hostshort) -- "host to network, short", for converting a two-byte integer from host order to network order
- uint32_t htonl(uint32_t hostlong) -- "host to network, long", for converting a four-byte integer from host order to network order
- uint16_t ntohs(uint16_t netshort) -- "network to host, short", for converting a two-byte integer from network order to host order
- uint32_t ntohl(uint32_t netlong) -- "network to host, long", for converting a four-byte integer from network order to host order
(Older documentation and Winsock declare these with unsigned short and unsigned long. The fixed-width types shown here are the POSIX declarations; they make the two-byte and four-byte sizes explicit even on systems where a long is eight bytes.)
The recommended procedure is that you keep the values in host order and convert them when you move them into and out of socket structures, such as when dealing with port numbers. The conversion functions are also useful when constructing or parsing the data block.
As I learned years ago working construction: let your tools do the work for you.
Share and enjoy!
First uploaded on 2003 July 26.