Setting up ``Secure'' internet services at an Academic Centre

Kapil Hari Paranjape

Introduction and Security Policy

In brief, the primary task of the network at an academic research institute is to provide computational, document preparation and internet facilities to the institute members. Additionally, the network generally also serves the scientific community in the neighbourhood by giving temporary accounts to people from other educational institutes and providing internet connectivity to some of these institutes. The above description also gives the order of priority for restoration of services in case of any break-down or disruption.

In what follows we concentrate on network services that deal with internet access. This is the traditional focus of security and firewall configuration. However, security of the network also means protecting the ability to provide the services mentioned above without loss of continuity. In particular, we ignore at our own peril other aspects of security such as power supply, physical protection of the devices, hardware maintenance and backup mechanisms.

To simplify the discussion we divide the network into four zones which we briefly describe in the first section. The guidelines for the configuration of various machines are discussed. In later sections we will examine the relevant portions of the configuration of each machine in detail. Not every aspect of TCP/IP networking can be explained in such a short document. The various ``HOWTO'' documents that are distributed with GNU/Linux are an invaluable source of additional information. The ``Further Reading'' sections of those documents will guide the reader into further and deeper understanding.

One important principle is that the policy assumes that legitimate users of the network will not try to break the network. In other words, once access to servers in the innermost zone is obtained, there are no additional checks at the network layer for access to any connected machine. Thus access controls are similar to standard password-based authentication.

1 Overview of Network Services

We begin with a discussion of the various active components that are taken into consideration and configured. For clarity we have divided the network into four zones. These are the entry points, the external services, the internal services and the firewall.

The basic philosophy is that machines on the internet will be allowed access only to the external service providing machines (sometimes referred to as the DMZ). The firewall then provides restricted access from these external servers to the internal servers. The entry points should be configured so that the DMZ network does not contain spurious/superfluous traffic.

In the converse direction, users on the network are provided essentially un-restricted access to the internet. However, certain limited access services, such as on-line journals or some remote login stations may require additional authentication. The firewall and internal servers should be configured to handle this.

The firewall can easily be pierced by a user within the network. A program that runs on a machine within the network can create a link to some machine outside. This link can act as a tunnel to connect to arbitrary machines within the network. In order to initiate the tunnel the user does not have to be logged in via a ``fast'' link; it can even be one of the slower links.

In each zone, one machine could be used to provide more than one of the services required. This should be done with care to see that this does not increase the risk. In the later sections we will see which services can (and in some cases should) be combined.

1.1 Entry points

The network usually has one primary external entry point. This is the router that connects the network to the internet. In addition, there are often telephone modems to provide connectivity for home-based users or for neighbouring institutes. The simplest solution is to assign these modems to the ``access'' machine that resides in the DMZ (see below). Often the modems are actually connected to a machine in the internal network but this can easily lead to some security problems. Whichever solution is adopted, we will not treat the modems as separate entry points; instead we treat them as terminals attached to a securely configured access machine.

The router must be configured to deny incoming packets whose source address is spoofed, that is, packets which give the source as an internal IP address or as an address in one of the designated ranges for networks not connected to the internet. In addition, the low port servers, SNMP server, HTTP server, and ICMP redirection on these routers should be disabled as part of the default security configuration recommended by most router manufacturers. Some of these services, notably SNMP and HTTP, could be re-enabled after proper configuration.
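The source-address check described above can be sketched in a few lines of Python. This is only an illustration of the test the router applies, not a router configuration; the internal address block (a documentation range) and the function name deny_at_router are assumptions made for the example.

```python
import ipaddress

# Assumed values for illustration only: the institute's own address block
# (a documentation range here) and the usual non-routable source ranges.
INTERNAL_NETS = [ipaddress.ip_network("198.51.100.0/24")]
UNROUTABLE_NETS = [
    ipaddress.ip_network(n)
    for n in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16", "127.0.0.0/8")
]


def deny_at_router(source: str) -> bool:
    """Return True if a packet arriving from the internet with this
    source address should be dropped as spoofed."""
    addr = ipaddress.ip_address(source)
    return any(addr in net for net in INTERNAL_NETS + UNROUTABLE_NETS)
```

A packet from the internet claiming to come from 10.1.2.3 or from the institute's own block would be denied, while an ordinary public address passes this particular check.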

In addition to these devices the network configuration often depends on various switches; if so, any switch that connects the router to the external servers and firewall should be passive. If it is active then it needs to be properly configured to prevent it from being used as a source of attack. The same applies if a terminal server is used instead of a multi-port card on an access server.

1.2 External Services

The number of machines that are used to provide external services depends, to some extent, on the expected load. These servers will provide services to those on the internet who wish to access information residing within the network. Potentially, all information available on these servers must be considered vulnerable to read/write/modify access. Good configuration and intrusion monitoring should ensure that such access is limited to that permissible under the security policy.

At the very least one needs a mail exchanger, a name server, an external access station and an httpd-accelerator. However, many of these services can be combined on a single machine if so desired.

Ideally, these machines should not have any software other than that essential to the service(s) provided. In some cases it may be reasonable, desirable and practical to run the services in a chroot'ed environment. Finally, it should be possible to upgrade or apply security patches to these machines within a short period of the vulnerability becoming known.

With all these considerations in mind it seems reasonable to choose one common platform for all machines. While this means that the same vulnerability may be found simultaneously on all machines, a routine application of security patches as they become available should avoid this difficulty. In case diverse platforms are chosen, the requirement of monitoring multiple security vulnerability mailing-lists outweighs (in the humble opinion of the author) the somewhat dubious additional security obtained. In any case different servers will have different software suites (based on the service provided), thus making it unlikely that the same vulnerability is found on all machines at once.

The common platform could be a GNU/Linux distribution such as Debian that maintains regular updates for security vulnerabilities. Other choices such as Red Hat, Bastille and so on are also possible. Two important considerations are that a limited amount of software be required to install and upgrade the system (in the author's not so humble opinion command-line tools such as Debian's apt are the best in this context); secondly, stability is a key consideration and ``bleeding-edge'' distributions should probably be avoided (as the system administrator is the one who is left bleeding!).

Some system administrators may want to examine OpenBSD or other BSD variants; the author has limited experience with these systems, but OpenBSD in particular is supposed to be very good in its implementation of the various points raised above.

Yet others may wish to examine the offerings of the proprietary vendors. In the author's not so humble opinion (based on real experience), these should be rejected outright on the following considerations (detailed explanations are available in a companion document):

  1. The out-of-box variants of these systems are often designed for ease of installation and use rather than security considerations (e.g. the Windows systems 95, 98, XP, Mandrake GNU/Linux and early versions of Solaris for workstations).
  2. Turning off un-needed services on these systems can be likened to tooth extraction (e.g. try extracting CDE from Solaris).
  3. Most of the services require purchase of additional software with possibly limited licenses for use (e.g. the Windows NT mail exchanger).
  4. Without the freedom of source being made available for perusal and modification, auditing/securing the systems is an impossible exercise (all proprietary systems suffer from this).

1.3 Internal Services

The internal services that we consider here are of two kinds; first of all there are support services for our own external servers and internal users such as a mail hub, a primary name server, a web server, a file server and an authentication server; secondly, some servers are required to regulate and/or accelerate services that local users wish to access from the internet, for example a web cache-server and a name lookup server.

Some of these services may be combined on one machine; indeed this may be desirable to reduce inter-dependency. It may be a good idea to restrict login access by users to reduce the load on these servers.

Since the only machines that will access the services provided are those that have been configured by the system administrator and machines on the internal network, it is reasonable to give these servers a secondary security status as compared with that required by the external servers. It is still necessary to apply security related patches reasonably often and keep track of vulnerabilities on the web server, mail server, name server and authentication server.

While the configuration described in this document assumes that users can be trusted not to try to break the security of the system, it is still necessary to protect against such breaches being initiated by them inadvertently. Thus it is best to avoid platforms that are known to be susceptible to viruses and other ``automatically'' activated software. In particular, it should not be possible for a user to install repeatedly invoked programs (terminate-stay-resident, daemon programs, cron or batch jobs) on a system used for such services.

From these points of view it is still desirable to stay within the Unix-like platforms for such services and avoid the Windows platforms.

1.4 Firewall

The firewall is configured to forward, filter, re-direct and masquerade traffic to and from our network. These decisions are usually based on packet headers. In brief, its tasks are:
  1. To deny access to the internal network from addresses that are dummy internet addresses or otherwise known to be spoofed.
  2. To forward connections initiated from local machines to machines on the internet.
  3. To deny connections to the internal network from any address not belonging to one of the external servers.
  4. To allow service-specific connections to the internal servers from the external servers; for example, the httpd accelerator will connect to the ``real'' web server, the mail exchanger will deposit mail with the mail hub and so on.
  5. To redirect specialised service requests from local machines to the relevant proxy servers or special ports; for example, the web requests could be handled through a web cache.
  6. To re-write the source addresses of other service requests so that the addresses internal to the network are not visible on the internet.
Clearly, this is the most critical component of the network, from the point of view of continuity of service and to repel possible intruders. All remarks made regarding the choice of platform for the external servers apply. At the same time this machine does not need much software apart from the GNU/Linux operating system, the init program, a few network utilities, a shell which supports scripts and perhaps sshd in order for the system administrator to login and perform maintenance tasks. There are platforms (such as the ``Linux Router Project'') which even bypass this latter step as maintenance is performed on another machine which acts as a boot server for the firewall.

The firewall machine should thus be a GNU/Linux platform. We have chosen to install this using the guidelines from ``Linux From Scratch''. While this makes it difficult to keep track of the security updates that are required, it ensures that only software that is absolutely essential to its functioning has been installed on this machine. At a future date it may be more convenient to install Debian GNU/Linux or some other version.

2 Configuration of Name Services

The name service configuration is based on three kinds of servers:
  1. An external server to respond to queries from hosts on the internet which wish to resolve the names or addresses within our domains.
  2. An internal server to store and manage the databases for our assigned domains.
  3. An internal server to respond to queries generated within our network.
The last is not entirely essential as each machine on the inner zone could resolve names and addresses on its own. However, it is generally a good idea to at least have a server which will respond to queries regarding local machines (to prevent those connections from being spoofed or hijacked). It is also a good idea to create a pool of lookups to speed up access. The latter aspect allows one to club this with the ``proxy'' servers described later.

Note that ``on the internet'' the primary name server for the assigned domains is announced as the first machine, but the program on this machine runs as a ``slave'' to the second machine, which is the ``real'' name server in the internal zone. Thus the actual files are inside the firewall, but the data is accessible from outside.

2.1 External Domain Name Server

A ``chrooted'' DNS server should be used. In particular, a subdirectory should be created with copies of various files. The name service program runs with this directory as its root directory and cannot access any other part of the file system on the machine. Moreover, a new user with a new group should be created which is the user-level privilege with which the daemon runs. This may require minor modification of the startup script for the DNS program as well as the startup script for system logging.

The name server responds to queries regarding all zones for which it is a name server. It also allows arbitrary ``recursive'' queries from our local external zone machines as well as the ISP's name servers. It does not allow such queries from the internet. The program should accept connections made to the designated port for name service at the address assigned to the primary name server for our domains.
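As a rough illustration of this query policy (and not of the configuration syntax of any particular name server program), the decision can be written as a small Python function. The zone names, the address ranges and the function names are hypothetical and chosen only for the example.

```python
import ipaddress

# Hypothetical zones and addresses, chosen only for illustration.
OUR_ZONES = ("example.ac.in", "100.51.198.in-addr.arpa")
RECURSION_ALLOWED = [
    ipaddress.ip_network("198.51.100.0/28"),  # assumed external-zone machines
    ipaddress.ip_network("192.0.2.53/32"),    # assumed ISP name server
]


def in_our_zones(qname: str) -> bool:
    """True if the queried name lies in a zone we serve."""
    name = qname.rstrip(".").lower()
    return any(name == z or name.endswith("." + z) for z in OUR_ZONES)


def answer_policy(client: str, qname: str) -> str:
    """Decide how the external name server treats a query."""
    if in_our_zones(qname):
        return "authoritative"   # answered for anyone on the internet
    addr = ipaddress.ip_address(client)
    if any(addr in net for net in RECURSION_ALLOWED):
        return "recursive"       # only for trusted neighbours
    return "refused"
```

An arbitrary internet host asking about a name in our zones gets an authoritative answer; the same host asking about some other domain is refused.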

This machine can also act as a server for other external services as name service of the above kind usually does not create much of a load on the system.

2.2 Internal Primary Name server

This name server is the ``real'' server where all the authoritative data regarding our domains is stored. However, it serves this data only to the other name servers operating within our network (internal as well as external zones). It can also accept arbitrary ``recursive'' queries from such hosts.

This machine can also act as a server for other internal services as the above service is not much of a load. In particular, if the internal network has a ``private'' name service such as nis, ldap or wins, then it is probably good to use this machine at least as a ``slave'' for such services so that the hostname/address database management tasks can be simplified.

2.3 Internal Name Lookup Cache

This is the server that is the first DNS resolver for each machine in the internal zone. Thus it caches the name lookups for all the machines on the internal network. In addition, it is a good idea to keep it as a ``slave'' DNS server for our domains so that local lookups can proceed quickly; however, it should not be listed as a name server in the Start of Authority (SOA) record for the domain.

This service should probably be coupled with the Web Object Cache (or less preferably with the Mail Server) so that the name lookups associated with that service can add to the name service cache.

3 Configuration of Mail Services

Mail services are based on two types of servers:
  1. An external server to send/receive mail from hosts on the internet.
  2. An internal server or mail hub which handles actual delivery of local mail (and generation of external mail based on user forwarding instructions).
It might also be a good idea to split the first service over two servers.

3.1 External Mail Server

If possible a ``chrooted'' mail exchange program should be used. It should be configured to allow relaying to/from ``internal'' hosts but deny all other forms of relaying (except for other domains with which a prior exchange arrangement has been worked out). All mail that this machine accepts for our domains is passed up to the mail hub for actual delivery. For these services it should accept connections made to the designated port for mail service at the address assigned to the primary mail exchanger for our domains.
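The relaying rules above amount to a short decision table. The following Python sketch shows one way to express it; the domain names, the internal address block and the function names are assumptions made for illustration, and a real mail exchanger would express the same policy in its own configuration language.

```python
import ipaddress

# Hypothetical names and addresses used only for this example.
OUR_DOMAINS = {"example.ac.in"}
PARTNER_DOMAINS = {"partner.example.org"}   # prior exchange arrangement
INTERNAL_NETS = [ipaddress.ip_network("198.51.100.0/24")]


def recipient_domain(address: str) -> str:
    """Return the lower-cased domain part of a mail address."""
    return address.rsplit("@", 1)[-1].lower()


def accept_mail(client_ip: str, recipient: str) -> str:
    """Decide what the external mail server does with one recipient."""
    domain = recipient_domain(recipient)
    if domain in OUR_DOMAINS:
        return "pass-to-mail-hub"      # local delivery happens on the hub
    if domain in PARTNER_DOMAINS:
        return "relay"
    client = ipaddress.ip_address(client_ip)
    if any(client in net for net in INTERNAL_NETS):
        return "relay"                 # outgoing mail from our own hosts
    return "reject"                    # no open relaying
```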

All mail originating from the local machines is masqueraded as coming from our domain in order to ensure that ``reply'' modes work for recipients. Finally, some mail generated in the external zone by various monitoring processes should be sent to ``real'' users (system administrators) at the mail hub.
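Masquerading of sender addresses can be pictured as a simple rewrite of the host part. The Python sketch below assumes a hypothetical domain example.ac.in and a hypothetical function name; real mail programs perform this rewriting through their own configuration options.

```python
OUR_DOMAIN = "example.ac.in"  # hypothetical domain for illustration


def masquerade_sender(address: str) -> str:
    """Rewrite user@host.example.ac.in to user@example.ac.in.

    Addresses outside our domain, or without a host part, are returned
    unchanged.
    """
    local, _, host = address.rpartition("@")
    if not local:
        return address
    host = host.lower()
    if host == OUR_DOMAIN or host.endswith("." + OUR_DOMAIN):
        return f"{local}@{OUR_DOMAIN}"
    return address
```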

Since the server needs to perform lookups for domains and addresses it is a good idea to have at least a caching name server running on this machine. It can also serve some other low load services.

3.2 Mail Hub

The mail hub runs a mail delivery program so it needs to have write access to the mail spool areas. It may also need to re-direct mail based on aliases and forwarding. Thus it should have access to the user database and the user ``home'' directories if that is where the .forward files are kept. In particular, if these are managed through nis, wins or ldap, then this machine should be at least a slave for such services.

4 Configuration of Web Services

The Web services are based on three types of servers:
  1. An external Web Accelerator which caches the pages for which there are requests from the internet.
  2. The ``real'' Web server(s) that resides in the internal zone.
  3. A Web Object Cache that actually makes all requests from the local machines to Web servers on the internet.

4.1 Reverse Web proxy

This machine resides in the external zone and listens for incoming requests on the designated port for web service at the address of the web server that is announced on the name service. It then checks for the relevant page on the internal ``real'' web server; if that page has been modified since it was last cached (or has never been cached), it is retrieved and sent to the remote client; otherwise the cached page is served. Access control lists for known ill-behaved ``robots'' and other clients can also be implemented.
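The caching decision made by such a reverse proxy can be sketched as follows. The sketch serves the cached copy unless the internal server holds a newer version of the page. The classes Page, Origin and ReverseProxy are hypothetical stand-ins written for this example (a real accelerator would speak HTTP to the internal server), so this is a model of the logic rather than of any actual program.

```python
from dataclasses import dataclass


@dataclass
class Page:
    body: str
    modified: int  # last-modification time, e.g. seconds since the epoch


class Origin:
    """Stand-in for the internal ``real'' web server."""

    def __init__(self, pages):
        self.pages = pages        # path -> Page
        self.full_fetches = 0     # counts complete page transfers

    def last_modified(self, path):
        page = self.pages.get(path)
        return None if page is None else page.modified

    def get(self, path):
        self.full_fetches += 1
        return self.pages[path]


class ReverseProxy:
    def __init__(self, origin):
        self.origin = origin
        self.cache = {}

    def serve(self, path):
        stamp = self.origin.last_modified(path)
        if stamp is None:              # page no longer exists
            self.cache.pop(path, None)
            return None
        cached = self.cache.get(path)
        if cached is None or stamp > cached.modified:
            cached = self.origin.get(path)   # fetch the fresh copy
            self.cache[path] = cached
        return cached.body
```

Repeated requests for an unchanged page cause only one full transfer from the internal server; a second transfer happens only after the page is modified.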

This service is likely to put some load on the serving machine which depends on how heavily the domain's site is accessed. If the traffic is low the machine can probably handle other services.

4.2 The Web Page Server

This machine, which resides in the internal zone, is the ``real'' web server. It should have access to the user database so that it can serve the individual user home pages. Thus it should be a replica or ``slave'' for the nis, ldap or wins services that provide internal name service. The home pages should be stored on this machine and exported to the machines that users usually log in on so that they can make modifications.

This server has low to medium load depending on the demand for Web pages based in our domain. It can also serve additional pages for the local domain such as a calendar; it can be a local software repository and provide other such services. Care should be taken to configure the Web server so that this internal data is not visible to the external requests (which are made via the Reverse Web proxy).

4.3 The Web Object Cache

This machine resides in the internal zone and handles all web requests originating from local machines. This makes it easier to configure access to online journals that require all requests to originate from a specific IP address. Additionally, most users on the network access similar data from the Web so caching helps in speeding up access. This machine must thus be configured to accept all traffic directed at the designated port for web service for any address and redirect it to the program that caches the data.

As caching name to address lookups is an integral part of this service it is probably a good idea to use this machine to cache name lookups as well. These two services will probably load any machine sufficiently that it cannot serve any other purpose.

5 User Access Services

This is of course the primary purpose of the internal services. To provide access to the legitimate users and store their data we need the following servers:
  1. An external access server which provides users access to their files and mail when they do not have physical access to the network.
  2. An external login server which provides a restricted access to some users who wish to use computational facilities on the internal network.
  3. An internal File and Network server that stores user files, and the user information database.
The first and third machines will probably not be able to handle any other services.

5.1 Access server

Most users who wish to access their files and mail while physically absent would be provided services on this machine. In particular, it needs to act as a client for the File and Network Server described below. Mail generated by users on this machine is passed to the Mail Hub. Finally, this machine provides terminal and X-terminal based logins for machines in the external zone.

Primary access to this machine should be through secure protocols such as ssh. In addition, logins via telnet, ftp and mail access via imap and pop should also be allowed. Via these services all user files and mail that reside on the common File and Network Server become potentially ``visible'' on the internet. Thus users must be informed that they need to take other precautions to make their data ``invisible''. This can be through storage of sensitive data in private storage media or by strong encryption of the data.

Since the protocols (other than ssh) perform password verification without data-encryption it may not be a good idea to allow such access from arbitrary machines on the internet. Once SSL (Secure Socket Layer) or IPSec versions of the clients for these services become commonly available these should be used. Some form of weak access control such as reverse lookup of the connecting address can at least give some confidence that the user is not using a ``maliciously configured'' remote machine that will record key-strokes or passwords.
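One common form of the reverse-lookup check is to look up the name for the connecting address and then confirm that this name resolves back to the same address. The Python sketch below uses the standard socket module; the function name forward_confirmed_name is an assumption of this example, and the two lookup functions are passed as parameters only so that the logic can be exercised without a live name server.

```python
import socket


def forward_confirmed_name(ip, reverse=socket.gethostbyaddr,
                           forward=socket.gethostbyname_ex):
    """Return the host name for ip if the reverse lookup succeeds and the
    name resolves back to the same address; otherwise return None."""
    try:
        name = reverse(ip)[0]           # (hostname, aliases, addresses)
        addresses = forward(name)[2]
    except OSError:                     # covers socket.herror and gaierror
        return None
    return name if ip in addresses else None
```

The access server could then accept the weaker protocols only from addresses for which this function returns a name in a list of known institutions. As the text says, this is weak control: it gives some confidence, not proof.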

5.2 Login Server

Some users may wish to access the internal zone on a regular basis. Primarily, these are users who belong to another organisation but run programs on some specialised machine which resides in the internal zone. Thus a service is provided whereby users login to a restricted account on an external server called the ``login server'' and then can further login to a machine on the internal zone.

We can provide such access through the use of a public-key mechanism (as provided by ssh); additional security is available through the use of IP-address based restrictions on the hosts from which connections are permitted. Users must provide a pair consisting of a public-key and an IP-address in order to make use of the service. The account that they log in to has a restricted shell. Expiry records should be maintained for these services.
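A record of such grants might be kept as sketched below. The Grant structure and the function name are hypothetical; the only feature of ssh assumed here is the from="..." option of the authorized_keys file, which restricts the hosts from which a given key is accepted. The restricted shell itself would be set on the account and is not shown.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Grant:
    user: str         # name of the outside user (for the records)
    public_key: str   # one line: key type, key data, comment
    address: str      # IP address the user connects from
    expires: date     # last day on which the grant is valid


def authorized_keys_line(grant: Grant, today: date):
    """Return the authorized_keys entry for a grant, or None if expired."""
    if today > grant.expires:
        return None
    return f'from="{grant.address}" {grant.public_key}'
```

Regenerating the authorized_keys file of the restricted account from these records every day is one simple way to make the expiry records take effect.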

This is a low-load service and can be combined with other services.

5.3 File and Network Server

This is the server which provides user home directories and mail folders (the Web home pages of the users are usually stored on the Web Page server to speed up access). The same server also provides user name and authentication (nis) services; this is convenient since this information is required to match users to files. The most common protocols for this are based on rpc or Remote Procedure Calls. The access server outside the firewall must have access to this data, but the original rpc has no static port-to-service mapping, which makes it difficult to configure the firewall appropriately. One alternative is to use newer versions of the programs that allow static port assignments. Another is to make dynamic changes to the firewall configuration based on port lookups. It may also be a good idea to use a different file sharing and user authentication protocol such as AFS with Kerberos.
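For the approach based on port lookups, the first step is to find out which ports the rpc services are currently using. A minimal sketch in Python follows; it assumes the tabular output of the command rpcinfo -p (program, version, protocol, port, service name), and the sample text is written by hand for illustration rather than captured from a real machine. Lines without a service name are simply skipped here.

```python
def parse_rpcinfo(text):
    """Map service name -> set of (protocol, port) from `rpcinfo -p` output."""
    services = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 5 or not fields[0].isdigit():
            continue  # header line, or a line without a service name
        _program, _version, proto, port, name = fields
        services.setdefault(name, set()).add((proto, int(port)))
    return services


# Hand-written sample in the assumed format.
SAMPLE = """\
   program vers proto   port  service
    100000    2   tcp    111  portmapper
    100000    2   udp    111  portmapper
    100003    3   udp   2049  nfs
    100005    3   udp    635  mountd
"""
```

A script on the firewall could compare the result with the ports currently opened for the access server and adjust the rules when they differ.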

6 The Firewall Machine

The firewall is implemented on a machine with two network interfaces, one connected to the internal zone and the other to the external zone and the router. Thus, when this machine is off there is no connection between the internal zone and any other zone. The configuration of this machine is critical to the correct functioning of the system as a whole. Ideally this machine should have a console where no-one except the system administrator can log in. Non-console logins should be limited to those based on ssh from some selected machines in the internal zone. No services other than this should run on this machine.

6.1 Network Interfaces

The network interfaces should be labeled (internal and external). The internal interface should be given an address which is configured as the default ``gateway'' for all machines in the internal zone. The external interface can be given one or more addresses that will be used to masquerade the connections made to the internet from machines in the internal zone. A list of hardware addresses of the network cards of critical servers in the internal and external zones should be loaded in the arp table of this machine to prevent such services from being disrupted by someone inadvertently (or maliciously!) assuming the IP address of a critical service.
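The idea of the fixed list of hardware addresses can be illustrated with a small check: given the pinned table, an observed pairing of IP address and hardware address is either consistent, in conflict, or not covered. The addresses and the name check_arp below are invented for the example; on the firewall itself the table is loaded into the kernel's arp table rather than checked by a script.

```python
# Hypothetical pinned entries: IP address -> hardware (MAC) address.
PINNED = {
    "198.51.100.2": "00:11:22:33:44:55",   # assumed mail exchanger
    "198.51.100.130": "00:11:22:33:44:66", # assumed web object cache
}


def check_arp(ip: str, mac: str) -> str:
    """Compare an observed (ip, mac) pair against the pinned table."""
    expected = PINNED.get(ip)
    if expected is None:
        return "unpinned"
    return "ok" if mac.lower() == expected else "conflict"
```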

6.2 Basic Firewall Setup

The firewall setup is organised as a packet filter and router based on a number of scripts. Each one limits connections by ``fall-through'' rules. The first list that is relevant is the pre-routing list.

  1. Packets from ``dummy'' private addresses (RFC 1918) should be rejected if they arrive at the external interface.
  2. Packets from addresses that are known to belong to the internal zone should be rejected if they arrive at the external interface.
  3. Packets addressed to the http service that arrive on the internal interface and do not come from the Web Object Cache are ``marked'' for re-routing.
The marked packets are re-routed to the Web Object Cache for further action. The way the remaining packets are filtered depends on the relevant IP protocol. For the ICMP protocol, the only useful packets are ``echo'', ``echo reply'', ``redirect'' and ``timeout''. We drop all others. The TCP and UDP protocols are handled by a new list of filtering rules as follows (recall that these are all fall-through rules).
  1. Packets which are related to existing connections are accepted.
  2. New (outgoing) connections are permitted if these are made by machines in the internal zone (checked by interface as well as IP address).
  3. New (incoming) connections from machines not in the external zone (checked by IP address) which arrive at the external interface are rejected. (Note that the router has already been configured so that such packets cannot come from the internet).
  4. A separate rule list is created for the external zone. All packets arriving at the external interface from one of the external zone machines are re-directed to this list.
  5. The default policy is to reject all remaining packets.
The rule list for the external zone is further separated on basis of services and permits connections to be initiated by machines in the external zone to specific services on specific internal zone servers as described elsewhere in this document.
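The fall-through structure of these rule lists can be modelled in Python as a chain of tests in which the first match decides. This is a sketch of the logic only, not of the actual packet filter scripts: the address ranges, the Packet structure and the returned labels are assumptions made for the example, and the per-service rules for the external zone are represented by a single label.

```python
import ipaddress
from dataclasses import dataclass

# All addresses below are assumptions made for the example.
INTERNAL_NET = ipaddress.ip_network("198.51.100.128/25")
EXTERNAL_ZONE = ipaddress.ip_network("198.51.100.0/28")
WEB_CACHE = ipaddress.ip_address("198.51.100.130")
PRIVATE_NETS = [ipaddress.ip_network(n)
                for n in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")]
USEFUL_ICMP = {"echo", "echo-reply", "redirect", "timeout"}


@dataclass
class Packet:
    iface: str                 # "internal" or "external"
    src: str                   # source IP address
    proto: str                 # "tcp", "udp" or "icmp"
    dport: int = 0
    icmp_type: str = ""
    established: bool = False  # related to an existing connection


def decide(p: Packet) -> str:
    src = ipaddress.ip_address(p.src)
    # Pre-routing list.
    if p.iface == "external":
        if any(src in n for n in PRIVATE_NETS) or src in INTERNAL_NET:
            return "reject"
    if (p.iface == "internal" and p.proto == "tcp" and p.dport == 80
            and src != WEB_CACHE):
        return "redirect-to-web-cache"
    # ICMP: keep only the useful message types.
    if p.proto == "icmp":
        return "accept" if p.icmp_type in USEFUL_ICMP else "drop"
    # TCP and UDP fall-through rules.
    if p.established:
        return "accept"
    if p.iface == "internal" and src in INTERNAL_NET:
        return "accept"
    if p.iface == "external" and src not in EXTERNAL_ZONE:
        return "reject"
    if p.iface == "external":
        return "external-zone-rules"
    return "reject"            # default policy
```

For instance, a new connection from an arbitrary internet host arriving at the external interface is rejected, while one from a machine in the external zone is handed to the external zone rule list.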

Finally, outgoing packets need to be masqueraded:

  1. Packets going out from the Web Object Cache to the internet at large are masqueraded as coming from a specific address which has been registered with online journals.
  2. All other packets going out to the internet at large are masqueraded as some specific list of ISP-provided addresses so that the remote machines can find return routes.
Note that packets going out to machines in the external zone are not masqueraded as those machines need to know the precise internal zone machine that is sending the packet.
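The choice of source address for outgoing packets can be summarised in a small function. Again this is a sketch under stated assumptions: all addresses are invented, the function name is hypothetical, and a real masquerading implementation keeps track of each connection rather than choosing an address by simple arithmetic as is done here.

```python
import ipaddress

# Hypothetical addresses for illustration.
EXTERNAL_ZONE = ipaddress.ip_network("198.51.100.0/28")
WEB_CACHE = ipaddress.ip_address("198.51.100.130")
JOURNAL_ADDRESS = "198.51.100.20"   # address registered with online journals
MASQ_POOL = ["198.51.100.21", "198.51.100.22"]  # ISP-provided addresses


def masquerade_source(src: str, dst: str) -> str:
    """Return the source address that an outgoing packet should carry."""
    source = ipaddress.ip_address(src)
    if ipaddress.ip_address(dst) in EXTERNAL_ZONE:
        return src                  # external zone sees the real sender
    if source == WEB_CACHE:
        return JOURNAL_ADDRESS
    # Simplification: spread internal hosts over the pool by address.
    return MASQ_POOL[int(source) % len(MASQ_POOL)]
```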

Acknowledgements

The author received a lot of help, suggestions and criticisms from the following people: Rahul Basu, Pablo Ares Gastesi, T. Jayaraman, Krishna Maddaly, Suresh Rao and G. Subramoniam. In addition to them I must thank Meena Mahajan, Madhavan Mukund and M. V. N. Murthy for the reasonably well-documented system at IMSc that existed before I started tampering with it. Finally, not enough thanks can be said to Richard Stallman, Linus Torvalds and the numerous GNU, Linux and Debian hackers on whose software efforts the project rests.

Kapil Hari Paranjape 2002-01-07