With the continuous development of network communication technology, network bandwidth is increasing quickly, and the processing ability of the network stack can easily become the bottleneck of overall system performance. System calls, memory copies, and protocol processing are the main sources of overhead in the traditional kernel network stack, and they can largely limit its processing capability. A user-space network stack is an alternative implementation that reduces operating system and memory copy overhead and offers high customization flexibility. This paper focuses on two key technologies of the user-space network stack: the message communication mechanism and parallel processing.

The user-space network stack is driven by two message flows, the network data packet flow and the socket request flow, which together form a multi-producer, single-consumer model. This paper first designs a circular message queue to support the message communication mechanism. However, the lock synchronization overhead on the message queue can be considerable under high-speed traffic. To solve this problem, the paper presents an improved mutual exclusion algorithm for the message queue based on atomic operations, which is suitable for situations that require frequently checking whether the queue is empty or full. The paper then designs a lock-free architecture for the message communication mechanism, which uses round-robin scheduling in place of a mutex.

Finally, this paper studies parallel processing techniques for the user-space network stack. With the increasing popularity of multi-processor and multi-core architectures, the network stack cannot make the best use of the processing capability of the CPU cores if it remains single-threaded. This paper designs a parallel processing architecture for the user-space network stack, which distributes data packets to a specific network stack thread by hashing the four-tuple of each packet. A load balancing strategy between processors is then implemented using the “hard affinity” feature provided by the Linux kernel. Finally, several experiments show that parallel processing of the network stack achieves relatively good results.
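
The four-tuple dispatch described above can be illustrated with a minimal sketch. It is not the implementation from this paper: the struct layout, the names `four_tuple`, `tuple_hash`, and `pick_stack_thread`, and the choice of an FNV-1a hash are all assumptions made for illustration. The point it shows is that a deterministic hash of the four-tuple always maps a given tuple to the same thread index.

```c
#include <stdint.h>

/* Hypothetical four-tuple identifying the flow a packet belongs to. */
struct four_tuple {
    uint32_t src_ip;
    uint32_t dst_ip;
    uint16_t src_port;
    uint16_t dst_port;
};

/* Fold the low `nbytes` bytes of `v` into an FNV-1a hash state.
 * FNV-1a is an assumed choice; any deterministic hash would do. */
static uint32_t fnv1a_mix(uint32_t h, uint32_t v, int nbytes)
{
    for (int i = 0; i < nbytes; i++) {
        h ^= (v >> (8 * i)) & 0xffu;
        h *= 16777619u;               /* FNV prime (32-bit) */
    }
    return h;
}

/* Hash all four fields of the tuple. */
static uint32_t tuple_hash(const struct four_tuple *t)
{
    uint32_t h = 2166136261u;         /* FNV offset basis (32-bit) */
    h = fnv1a_mix(h, t->src_ip, 4);
    h = fnv1a_mix(h, t->dst_ip, 4);
    h = fnv1a_mix(h, t->src_port, 2);
    h = fnv1a_mix(h, t->dst_port, 2);
    return h;
}

/* Map a packet's four-tuple to a stack thread index in [0, nthreads).
 * The caller must pass nthreads > 0. */
static unsigned pick_stack_thread(const struct four_tuple *t, unsigned nthreads)
{
    return tuple_hash(t) % nthreads;
}
```

Because the index depends only on the four-tuple, packets carrying the same tuple are always handed to the same stack thread, so per-connection state need not be shared between threads in this sketch.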