NFS and Lustre
I’ve seen both NFS and Lustre in many clusters, so I decided to dig into how they differ. Both of them can:
- share a local file system name space across the network;
- be robust and resilient to network failures.
NFS
- stateless by design: the original protocol keeps no per-client state on the server, which makes it more robust to network failures (statefulness was introduced in NFSv4)
- ubiquitous: clients and servers are available on virtually every operating system, and it is designed for general-purpose file sharing
- easy to set up and troubleshoot
- I/O performance does not scale well: it uses an in-band protocol, so control messages travel on the same channel as the data payload, which limits the payload size. Read operations scale better than write operations (see the bandwidth sketch after this list).
- supports POSIX
- commonly deployed through commercial appliances (e.g., NetApp filers), although the protocol itself is an open standard
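To get a feel for the bandwidth difference in practice, here is a minimal Python sketch that measures sequential write bandwidth to a mounted path. The mount points /mnt/nfs and /mnt/lustre are hypothetical; substitute whatever your cluster actually uses.

```python
import os
import time

def write_bandwidth(path, total_mb=1024, block_kb=1024):
    """Write total_mb of data to `path` in block_kb chunks and return MB/s."""
    block = b"\0" * (block_kb * 1024)
    start = time.monotonic()
    with open(path, "wb") as f:
        for _ in range(total_mb * 1024 // block_kb):
            f.write(block)
        f.flush()
        os.fsync(f.fileno())  # force the data to the server, not just the local page cache
    elapsed = time.monotonic() - start
    os.remove(path)
    return total_mb / elapsed

# Hypothetical mount points; replace with real ones on your cluster.
print("NFS   :", write_bandwidth("/mnt/nfs/bench.tmp"), "MB/s")
print("Lustre:", write_bandwidth("/mnt/lustre/bench.tmp"), "MB/s")
```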
Lustre
- stateful by design: clients and servers maintain connections and per-client state (locks, recovery information), so failures can be detected and recovered from
- I/O performance scales: it uses third-party transfer. Clients ask the metadata server where a file lives, and the I/O itself then moves directly between the client and the affected storage server(s). Adding more storage servers reduces contention for any single resource, so aggregate throughput grows with the system (see the striping sketch after this list).
- supports POSIX
- open source (GPL-licensed)
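Striping is what makes the third-party transfer scale: a file is split across several object storage targets (OSTs), and clients talk to those OSTs directly once the metadata server has told them where the stripes live. A small sketch, assuming a Lustre client with the standard lfs utility and a hypothetical directory /mnt/lustre/stripe_demo:

```python
import os
import subprocess

# Hypothetical Lustre directory; `lfs` is the standard Lustre client utility.
lustre_dir = "/mnt/lustre/stripe_demo"
os.makedirs(lustre_dir, exist_ok=True)

# Stripe new files in this directory across 4 OSTs in 1 MiB chunks, so a single
# large file is served by several storage servers instead of one.
subprocess.run(["lfs", "setstripe", "-c", "4", "-S", "1M", lustre_dir], check=True)

# Show the resulting layout; reads and writes of each stripe go straight to its OST.
subprocess.run(["lfs", "getstripe", lustre_dir], check=True)
```

With a stripe count of 4, one large file is served by four storage servers, so adding OSTs adds bandwidth rather than just capacity.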
Choice between NFS and Lustre
- NFS: a good fit when file operations are mostly reads and the workload does not require sustained read or write bandwidth
- Lustre: the better choice when aggregate I/O bandwidth is a concern (see the concurrent-writer sketch below)
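To see which side of that trade-off a workload falls on, a rough test is to run several writers concurrently against each mount and compare aggregate throughput; Lustre should keep scaling as writers are added, while a single NFS server tends to flatten out. A minimal sketch, again assuming hypothetical /mnt/nfs and /mnt/lustre mount points:

```python
import os
import time
from multiprocessing import Pool

def writer(args):
    """Write `mb` MiB to `path` in 1 MiB blocks, fsync, then clean up."""
    path, mb = args
    block = b"\0" * (1 << 20)
    with open(path, "wb") as f:
        for _ in range(mb):
            f.write(block)
        os.fsync(f.fileno())
    os.remove(path)
    return mb

def aggregate_bandwidth(mount, workers=8, mb_each=256):
    """Aggregate MB/s when `workers` processes write concurrently under `mount`."""
    jobs = [(os.path.join(mount, f"bench_{i}.tmp"), mb_each) for i in range(workers)]
    start = time.monotonic()
    with Pool(workers) as pool:
        total = sum(pool.map(writer, jobs))
    return total / (time.monotonic() - start)

if __name__ == "__main__":
    # Hypothetical mount points; replace with the shared directories on your cluster.
    for mount in ("/mnt/nfs", "/mnt/lustre"):
        print(mount, aggregate_bandwidth(mount), "MB/s")
```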