Information about arpd: The user-space ARP daemon for Linux

Installation

This is the first public release of arpd, a daemon intended to move most of the kernel's ARP queries to user space. For this version, you will need a post-1.3.95 kernel and the following tar.gz file (tested most recently against 2.0.0):

Installation is pretty straightforward:


Motivation

The current implementation of the Linux ARP mechanism involves maintaining a hash table in kernel space with 4-bit keys. While this works for small networks (class C or smaller), it does not scale to large networks. Even if the hashing function were changed, there is still a fundamental problem in that there is only a finite amount of memory available to the kernel. Experience has shown that trying to fill the Linux ARP table on a network having a mask of 255.255.240.0 will tend to bring your system to its knees very quickly. Note that you don't need lots and lots of machines to wreak havoc on your poor kernel. Unresolved ARP entries actually consume more memory than resolved entries! So, even if you only have 2 machines on a 255.255.240.0 network, but try pinging every address covered by the netmask, your kernel will likely complain.
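
To put a number on "large", here is a small sketch that counts the usable host addresses covered by a netmask. The function name usable_hosts is made up for this example and is not part of arpd; the mask is assumed to be given in host byte order.

```c
#include <stdint.h>

/* Number of usable host addresses covered by an IPv4 netmask given in
 * host byte order (e.g. 0xFFFFF000 for 255.255.240.0). The network and
 * broadcast addresses are excluded. Illustrative helper, not arpd code. */
static uint32_t usable_hosts(uint32_t netmask)
{
    uint32_t span = ~netmask;        /* highest host number in the subnet */
    return span > 0 ? span - 1 : 0;  /* drop the network and broadcast addresses */
}
```

For 255.255.240.0 (0xFFFFF000) this gives 4094 addresses, against 254 for a class C mask, so a sweep of the whole subnet asks the kernel to track roughly sixteen times as many entries.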

Implementation

The kernel's arpd support communicates with the user-space daemon via the netlink driver. Enabling both netlink and arpd support has one immediate effect: the kernel's internal ARP table will not grow to more than 256 entries. Otherwise, your kernel should behave exactly as it did before the patch until your ARP table grows to 256 entries in size.

WARNING: If you run your kernel with arpd support, try to connect to more than 256 IP addresses on your local network, and are not running arpd itself, you run the risk of ARP storming your local network. Be careful! Don't forget all the other usual precautions and disclaimers involved in building kernels!

Two types of messages are understood by arpd. ARPD_UPDATE does not require a reply and instructs arpd to replace the specified ARP entry. ARPD_LOOKUP asks arpd to send the ARP entry associated with an IP address, if one exists.
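
The two message types can be sketched as a small dispatcher. This is only an illustration of the behaviour described above, not the daemon's actual code: the struct layout, the numeric values of the message types, the tiny linear table, and the names demo_msg, demo_table, find_entry and handle_msg are all assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Message type names come from the text; the numeric values are placeholders. */
enum { ARPD_UPDATE = 1, ARPD_LOOKUP = 2 };

#define HW_LEN  6   /* Ethernet hardware address length */
#define MAX_ENT 8   /* tiny table, for illustration only */

struct demo_msg {           /* hypothetical message layout */
    int      type;
    uint32_t ip;
    uint8_t  hw[HW_LEN];
};

struct demo_table {
    struct demo_msg ent[MAX_ENT];
    int used;
};

/* Linear search; returns the stored entry for ip, or NULL if there is none. */
static struct demo_msg *find_entry(struct demo_table *t, uint32_t ip)
{
    for (int i = 0; i < t->used; i++)
        if (t->ent[i].ip == ip)
            return &t->ent[i];
    return NULL;
}

/* Returns 1 if *reply was filled in and should be sent back, else 0. */
static int handle_msg(struct demo_table *t, const struct demo_msg *in,
                      struct demo_msg *reply)
{
    struct demo_msg *e = find_entry(t, in->ip);

    switch (in->type) {
    case ARPD_UPDATE:               /* replace (or add) the entry, no reply */
        if (!e && t->used < MAX_ENT)
            e = &t->ent[t->used++];
        if (e)
            *e = *in;
        return 0;
    case ARPD_LOOKUP:               /* reply only if an entry exists */
        if (!e)
            return 0;
        *reply = *e;
        reply->type = ARPD_LOOKUP;
        return 1;
    default:                        /* unknown message types are ignored */
        return 0;
    }
}
```

In this sketch a lookup for an unknown address simply produces no reply, and an update is silently dropped once the small table is full. What the real daemon does in those cases is not specified here.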

The search algorithm I chose uses a trie, with each octet of the IP address represented on its own level of the trie. It is one-to-one, and the trie is built dynamically on demand (i.e., I don't go and allocate 2^32 entries at startup!). I figure this algorithm should scale well to IPv6 too.
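
A minimal sketch of such an octet-per-level trie follows. It is a reconstruction from the description above rather than the code shipped with arpd: the names trie_node, trie_new, trie_put and trie_get are invented for the example, and addresses are assumed to be IPv4 in host byte order with 6-byte hardware addresses.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define HW_LEN 6    /* Ethernet hardware address length */

struct trie_node {
    struct trie_node *child[256];   /* one slot per possible octet value */
    uint8_t hw[HW_LEN];             /* only meaningful on the fourth level */
    int has_entry;
};

/* Allocate a zeroed node (assumes all-zero bits represent NULL, as on Linux). */
static struct trie_node *trie_new(void)
{
    return calloc(1, sizeof(struct trie_node));
}

/* Insert or replace the entry for ip, creating nodes only along its path.
 * Returns 0 on success, -1 if memory runs out. */
static int trie_put(struct trie_node *root, uint32_t ip, const uint8_t hw[HW_LEN])
{
    struct trie_node *n = root;

    for (int shift = 24; shift >= 0; shift -= 8) {
        uint8_t octet = (ip >> shift) & 0xFF;
        if (!n->child[octet]) {
            n->child[octet] = trie_new();
            if (!n->child[octet])
                return -1;
        }
        n = n->child[octet];
    }
    memcpy(n->hw, hw, HW_LEN);
    n->has_entry = 1;
    return 0;
}

/* Return the stored hardware address for ip, or NULL if there is none. */
static const uint8_t *trie_get(const struct trie_node *root, uint32_t ip)
{
    const struct trie_node *n = root;

    for (int shift = 24; shift >= 0 && n; shift -= 8)
        n = n->child[(ip >> shift) & 0xFF];
    return (n && n->has_entry) ? n->hw : NULL;
}
```

A lookup always takes exactly four steps regardless of how many entries are stored, and memory is only spent on octets that have actually been seen. The price in this naive layout is a full 256-pointer array per node, leaf nodes included.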


Cautions and warnings aside, if you are running Linux on large networks, you may very well need arpd. If you do decide to play with this, please let me know about any problems you encounter. Also, I'm quite new to the world of kernel hacking, particularly in the network code. There are almost certainly things I have missed in my crash course in the internals of the kernel's network code.

Many thanks to Alan Cox, Bjorn Ekwall, and Alexey Kuznetsov for answering questions and pointing out bugs.