User – Kernel space communication in Linux

Applications running in user space can’t directly access memory in kernel space.

To exchange information with kernel modules, in either direction, Linux provides a set of standard mechanisms.

We’ll discuss some of the commonly used mechanisms and see which suits which use case.

Procfs & Sysfs

Procfs (/proc) and Sysfs (/sys) are windows into kernel space from user space.

Both are virtual filesystems: they live in memory, are populated by the kernel when mounted (normally at boot) and disappear on shutdown. Kernel components present various kinds of process and system information in these files.

Sysfs, the newer sibling, is much more structured and restrictive than its older counterpart. Both interfaces can be used to get and set attributes of kernel modules, e.g. setting a driver attribute to enable logging.

Essentially, the following happens:

  • You register your attribute handlers in the kernel module
  • Expose the attributes to user space by creating entries in the sysfs/procfs space
  • The user space application reads and writes these attributes to exchange data between user and kernel space

Procfs takes a table of function pointers that control how the regular file system calls (read/write/open/mmap etc.) behave on the entry: struct proc_ops since Linux 5.6, struct file_operations before that. These functions can perform arbitrary actions when the file is accessed.

Sysfs allows only two methods per attribute, show and store, which read and write the attribute value.

Sysfs allows only a single value to be read or written per file. Why?

Each file represents one attribute. Allowing only a single value per file enables very fine-grained control over access rights.
For example, a non-root user may be allowed to modify only a subset of the attributes.
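
From the user space side, working with a single-value attribute boils down to one open/read or open/write per file. The following is a minimal sketch of that pattern, not code from any particular driver; the helper names attr_read and attr_write are made up for illustration.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Read one attribute file into buf as a NUL-terminated string with the
 * trailing newline removed. Returns the value length, or -1 with errno set. */
static ssize_t attr_read(const char *path, char *buf, size_t buflen)
{
    if (buflen == 0) {
        errno = EINVAL;
        return -1;
    }
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    ssize_t n = read(fd, buf, buflen - 1);   /* one value, one read() */
    int saved = errno;
    close(fd);
    if (n < 0) {
        errno = saved;
        return -1;
    }
    while (n > 0 && buf[n - 1] == '\n')
        n--;
    buf[n] = '\0';
    return n;
}

/* Write one value to an attribute file. Returns 0 on success, -1 on error. */
static int attr_write(const char *path, const char *value)
{
    int fd = open(path, O_WRONLY);
    if (fd < 0)
        return -1;
    size_t len = strlen(value);
    ssize_t n = write(fd, value, len);       /* the whole value in one write() */
    int saved = errno;
    close(fd);
    if (n != (ssize_t)len) {
        errno = (n < 0) ? saved : EIO;
        return -1;
    }
    return 0;
}
```

A caller would pass the path of the attribute it cares about, for instance some file under /sys or /proc/sys. Whether a write succeeds depends on the permissions the kernel module set on that attribute.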

Configfs

In contrast to sysfs, with configfs the kernel objects and their attributes are created from user space.

Object creation and removal are driven from user space through mkdir/rmdir calls, while the kernel side still implements the show/store functions for the attributes. On each mkdir() the kernel responds by creating the attributes (files), which the user can then read and write.

Why not use sysfs for this?

Configfs gives the flexibility to expose different characteristics based on conditions that are decided in user space.
For example, USB gadget drivers use the configfs interface heavily. Based on the user’s choice, the same USB-connected phone can be exposed as a mouse, a keypad or mass storage.
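
Since the user space half of configfs is just mkdir/rmdir, it can be sketched in a few lines. This is an illustrative sketch only; config_item_create and config_item_remove are hypothetical helper names, and on a filesystem other than configfs the directory will simply stay empty because no kernel subsystem is there to populate it.

```c
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Create a config item. On configfs, the owning kernel subsystem reacts to
 * this mkdir() by allocating the object and filling the new directory with
 * attribute files. Returns 0 on success, -1 on error. */
static int config_item_create(const char *parent, const char *name)
{
    char path[4096];
    int n = snprintf(path, sizeof path, "%s/%s", parent, name);
    if (n < 0 || (size_t)n >= sizeof path)
        return -1;
    return mkdir(path, 0755);
}

/* Destroy a config item. On configfs, rmdir() asks the kernel to drop the
 * object. Returns 0 on success, -1 on error. */
static int config_item_remove(const char *parent, const char *name)
{
    char path[4096];
    int n = snprintf(path, sizeof path, "%s/%s", parent, name);
    if (n < 0 || (size_t)n >= sizeof path)
        return -1;
    return rmdir(path);
}
```

For the USB gadget case, the parent would be the gadget directory under the configfs mount point (commonly /sys/kernel/config/usb_gadget when the gadget support is loaded), and the name is whatever the user picks for the gadget.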

Debugfs

Debugfs is another virtual filesystem, typically used by kernel modules for debugging purposes.

A common use is to expose driver-level statistics to user space. Unlike sysfs, debugfs imposes no rules on file contents, so a kernel module can exchange unstructured data with user space.

Memory Mapping

Memory mapping, commonly known as mmap, is one of the fastest ways to exchange data.

It is the preferred way to move large amounts of data, e.g. video buffers from kernel to user space or DMA transfers, because both sides access the same pages and nothing is copied.

Typically the driver maps buffers it has allocated itself into the process’s address space. Such kernel-owned pages are not swapped out. Ordinary file-backed or anonymous mappings carry no such guarantee unless the pages are locked, e.g. with mlock().

void *mmap(void *addr, size_t length, int prot, int flags, int fd, off_t offset);

mmap() is a powerful and useful system call. It is specified by POSIX, so it is available on Linux and other Unix-like systems, but not on non-POSIX platforms such as native Windows. What a given device driver supports through mmap() is also driver-specific, so it must be used with caution.
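
A minimal user space sketch of a shared mapping follows. It maps a regular file so that it can run anywhere; mapping a device node exposed by a driver would use the same call, with the driver deciding which buffer ends up behind the mapping. The helper names map_shared and unmap_shared are made up for this example.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Map the first len bytes of path read/write and shared, so that stores
 * through the returned pointer reach the underlying object.
 * Returns NULL on failure. */
static void *map_shared(const char *path, size_t len)
{
    int fd = open(path, O_RDWR);
    if (fd < 0)
        return NULL;
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);                     /* the mapping stays valid after close() */
    return p == MAP_FAILED ? NULL : p;
}

/* Tear the mapping down again. Returns 0 on success, -1 on error. */
static int unmap_shared(void *p, size_t len)
{
    return munmap(p, len);
}
```

Note that the file (or device buffer) must be at least len bytes long; touching mapped pages beyond the end of the backing object raises SIGBUS.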

Relayfs

The relay interface (formerly relayfs) is primarily designed for efficient transfer of large amounts of data from kernel to user space, through the use of user-defined relay channels.

These relay channels are backed by either per-CPU or system-wide kernel buffers. Relayfs provides the option of lockless event logging, which performs better than the locking mode. Note that lockless logging has its own set of caveats and hence should be used carefully.

The high-level idea of relayfs is simple:

  1. Define channels (buffers) in the kernel for a subsystem
  2. Expose these channels to user space as files
  3. Use regular read() calls to get the data from these files

While it may not be as efficient as mmap(), it’s easy to use thanks to the file interface.
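
Step 3 above is an ordinary read loop. The sketch below drains a file descriptor and hands each chunk to a callback; it is an illustration under the assumption that events are newline-terminated text, and the names drain_channel, chunk_sink and count_events are invented for the example. A real relay consumer would typically poll() for more data instead of stopping at the first read() that returns 0.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Callback that receives each chunk read from the channel file. */
typedef void (*chunk_sink)(const char *data, size_t len, void *ctx);

/* Example sink: counts newline-terminated events (assumed format). */
struct event_count { size_t events; };

static void count_events(const char *data, size_t len, void *ctx)
{
    struct event_count *c = ctx;
    for (size_t i = 0; i < len; i++)
        if (data[i] == '\n')
            c->events++;
}

/* Read fd until read() returns 0, passing every chunk to sink.
 * Returns the total number of bytes consumed, or -1 on a read error. */
static long drain_channel(int fd, chunk_sink sink, void *ctx)
{
    char buf[4096];
    long total = 0;

    for (;;) {
        ssize_t n = read(fd, buf, sizeof buf);
        if (n == 0)
            break;                 /* nothing more to read right now */
        if (n < 0) {
            if (errno == EINTR)
                continue;          /* interrupted by a signal, retry */
            return -1;
        }
        sink(buf, (size_t)n, ctx);
        total += n;
    }
    return total;
}
```

With per-CPU buffers there is one file per CPU, so a consumer would run this loop once per channel file.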

IOCTL

IOCTLs are mostly used when interacting with device drivers.

Steps to define and use an IOCTL are:

  • Define the IOCTL command in the driver
  • Write the IOCTL handler function in the driver
  • Define the same IOCTL command in the user space application
  • Use the ioctl() system call in user space

An IOCTL command can be defined as:

#define IOCTL_NAME _IOX(magic_num, cmd_num, arg_type)

where _IOX can be:
_IO: an ioctl with no parameters (takes only magic_num and cmd_num)
_IOW: an ioctl with write parameters (copy_from_user)
_IOR: an ioctl with read parameters (copy_to_user)
_IOWR: an ioctl with both write and read parameters

  • magic_num is a unique number or character that differentiates our set of ioctl calls from other ioctl calls
  • cmd_num is the number assigned to the ioctl. It differentiates the commands from one another
  • arg_type is the type of the data being passed; its size is encoded into the command number
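
To make the encoding concrete, here is a small sketch that defines a command set with the real _IO/_IOW/_IOR/_IOWR macros from <linux/ioctl.h>. The command names, the magic character 'k' and struct demo_cfg are hypothetical and belong to no real driver. Because a custom driver is needed to actually issue those commands, the ioctl() call itself is shown with FIONREAD, an existing command that reports how many bytes are waiting to be read.

```c
#include <linux/ioctl.h>   /* _IO, _IOR, _IOW, _IOWR and the _IOC_* decoders */
#include <sys/ioctl.h>     /* ioctl(), FIONREAD */

/* Hypothetical argument type for an imaginary driver. */
struct demo_cfg {
    int level;
    int flags;
};

/* Hypothetical command set; the magic 'k' is an arbitrary choice. */
#define DEMO_MAGIC      'k'
#define DEMO_RESET      _IO(DEMO_MAGIC, 0)                    /* no argument        */
#define DEMO_SET_LEVEL  _IOW(DEMO_MAGIC, 1, int)              /* user -> kernel     */
#define DEMO_GET_LEVEL  _IOR(DEMO_MAGIC, 2, int)              /* kernel -> user     */
#define DEMO_XCHG_CFG   _IOWR(DEMO_MAGIC, 3, struct demo_cfg) /* both directions    */

/* Issue a real ioctl: ask how many unread bytes are queued on fd.
 * Returns the byte count, or -1 on error. */
static int bytes_pending(int fd)
{
    int n = 0;
    if (ioctl(fd, FIONREAD, &n) < 0)
        return -1;
    return n;
}
```

The _IOC_DIR(), _IOC_TYPE(), _IOC_NR() and _IOC_SIZE() macros from the same header take a command number apart again, which is how a driver can validate an incoming command before acting on it.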

Sockets

Sockets in Linux are a bidirectional communication mechanism that can be used to communicate with another process on the same machine or on another machine.

Sockets are characterized by three primary attributes:

  • Domain
    • AF_INET : used for internet (IPv4) networking. Sockets are associated with an IP address and port number.
    • AF_UNIX : used for local communication (within the same machine). Sockets are associated with a file system pathname
    • AF_NETLINK : used for communication between kernel and user space
    • There are others as well, but they are less commonly used
  • Type
    • SOCK_STREAM : reliable two-way byte stream, e.g. TCP
    • SOCK_DGRAM : unreliable, connectionless messages, e.g. UDP
    • SOCK_RAW : raw socket, carrying IP packets (no TCP/UDP header)
    • SOCK_SEQPACKET : sequenced-packet socket that is connection-oriented, preserves message boundaries, and delivers messages in the order that they were sent
  • Protocol
    • Selects a specific protocol within the protocol suite. socket(AF_INET, SOCK_STREAM, 0) gives you a TCP socket, as TCP is the default stream protocol for the IP protocol suite. If you want SCTP instead, you can use socket(AF_INET, SOCK_STREAM, IPPROTO_SCTP).
    • /etc/protocols lists the protocol names and numbers. Bear in mind that not all combinations of socket type & protocol are valid.
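
The difference between the socket types is easiest to see with message boundaries. The sketch below creates a connected AF_UNIX pair of the requested type, sends two messages back to back and reports how many bytes the first recv() returns. The helper name first_recv_len is made up for this example.

```c
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Send "ab" and then "cde" over a connected AF_UNIX socket pair of the given
 * type and return the number of bytes delivered by the first recv().
 * Returns -1 if the pair cannot be created or a send fails. */
static long first_recv_len(int type)
{
    int sv[2];
    char buf[64];
    long n = -1;

    if (socketpair(AF_UNIX, type, 0, sv) < 0)
        return -1;

    if (send(sv[0], "ab", 2, 0) == 2 && send(sv[0], "cde", 3, 0) == 3)
        n = recv(sv[1], buf, sizeof buf, 0);

    close(sv[0]);
    close(sv[1]);
    return n;
}
```

With SOCK_SEQPACKET or SOCK_DGRAM the first recv() delivers exactly the first message. With SOCK_STREAM there are no boundaries, so the same call may return the bytes of both messages at once.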

Of these, we are most interested in the AF_NETLINK domain, which is designed for communication between the kernel and user space (AF_UNIX, by contrast, connects user space processes on the same machine).

netlink_socket = socket(AF_NETLINK, socket_type, netlink_family);

socket_type -> SOCK_RAW or SOCK_DGRAM

netlink_family -> selects the kernel module or the netlink group to communicate with

The majority of the networking tools in Linux use the netlink interface to exchange data with the kernel. Linux has around 20 registered netlink families (with corresponding kernel modules), such as NETLINK_ROUTE and NETLINK_NETFILTER, which help expose kernel networking state to user space applications. NETLINK_FIREWALL, often listed alongside them, was removed in Linux 3.5. Users can also define their own netlink family with a corresponding kernel module for custom data transfer.
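
Opening a netlink socket from user space looks like any other socket setup. The following sketch opens and binds a socket for a given family and reads back the port id the kernel assigned; netlink_open and netlink_port_id are hypothetical helper names, and building and parsing actual netlink messages is left out.

```c
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <unistd.h>

/* Open a netlink socket for the given family (e.g. NETLINK_ROUTE) and bind it.
 * Returns the file descriptor, or a negative errno value on failure. */
static int netlink_open(int family)
{
    int fd = socket(AF_NETLINK, SOCK_RAW, family);
    if (fd < 0)
        return -errno;

    struct sockaddr_nl addr;
    memset(&addr, 0, sizeof addr);
    addr.nl_family = AF_NETLINK;   /* nl_pid left at 0: kernel picks a port id */

    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0) {
        int err = errno;
        close(fd);
        return -err;
    }
    return fd;
}

/* Return the port id the kernel assigned to the socket, or 0 on error. */
static unsigned int netlink_port_id(int fd)
{
    struct sockaddr_nl addr;
    socklen_t len = sizeof addr;

    if (getsockname(fd, (struct sockaddr *)&addr, &len) < 0)
        return 0;
    return addr.nl_pid;
}
```

Once bound, requests are sent as struct nlmsghdr-framed messages with send() or sendmsg(), and the kernel's replies are read back with recv(). Some sandboxed environments do not provide netlink at all, in which case netlink_open() reports the error.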

To summarize:

– Netlink is the preferred way of configuring network-related objects through sockets.
– Debugfs suits ad-hoc debugging interfaces that do not need to be exposed to applications as a stable interface.
– Sysfs is a good way to expose the state of an in-kernel object that is not tied to a file descriptor.
– Configfs can be used for more complex configuration than sysfs allows.
– A custom file system (such as relayfs) can provide extra flexibility with a simple user interface, but adds a lot of complexity to the implementation.
– mmap is the preferred way of exchanging large amounts of data efficiently.
