In
computing, a 'file system' (often also written as 'filesystem') is a method for storing and organizing
computer files and the data they contain to make it easy to find and access them. File systems may use a
data storage device such as a
hard disk or
CD-ROM and involve maintaining the physical location of the files; they might provide access to data on a file server by acting as clients for a network protocol (e.g.,
NFS,
SMB, or
9P clients), or they may be virtual and exist only as an access method for virtual data (e.g.,
procfs).
More formally, a file system is a set of
abstract data types that are implemented for the storage, hierarchical organization, manipulation, navigation, access, and retrieval of
data. File systems share much in common with database technology, but it is debatable whether a file system can be classified as a special-purpose database (
DBMS).
Aspects of file systems
The most familiar file systems make use of an underlying
data storage device that offers access to an array of fixed-size
blocks, sometimes called ''sectors'', generally 512 bytes each. The file system software is responsible for organizing these sectors into
files and
directories, and keeping track of which sectors belong to which file and which are not being used.
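The bookkeeping described above can be illustrated with a minimal in-memory sketch. The class names `BlockDevice` and `ToyAllocator` are hypothetical, and the layout is far simpler than any real file system: it only shows a device exposed as fixed-size blocks, plus a record of which blocks belong to which file and which are free.

```python
class BlockDevice:
    """A toy storage device: an array of fixed-size blocks (hypothetical)."""

    def __init__(self, num_blocks, block_size=512):
        self.block_size = block_size
        self.blocks = [bytes(block_size) for _ in range(num_blocks)]


class ToyAllocator:
    """Tracks which blocks belong to which file and which are unused."""

    def __init__(self, device):
        self.device = device
        self.free = set(range(len(device.blocks)))  # unused block numbers
        self.files = {}  # file name -> (ordered block numbers, byte length)

    def write_file(self, name, data):
        if name in self.files:
            self.delete_file(name)  # release the old blocks first
        size = self.device.block_size
        needed = max(1, -(-len(data) // size))  # ceiling division
        if needed > len(self.free):
            raise OSError("no space left on device")
        chosen = sorted(self.free)[:needed]
        self.free.difference_update(chosen)
        for i, block_no in enumerate(chosen):
            chunk = data[i * size:(i + 1) * size]
            # Pad the last chunk so every block stays exactly block_size long.
            self.device.blocks[block_no] = chunk.ljust(size, b"\0")
        self.files[name] = (chosen, len(data))

    def read_file(self, name):
        chosen, length = self.files[name]
        raw = b"".join(self.device.blocks[b] for b in chosen)
        return raw[:length]  # drop the padding in the final block

    def delete_file(self, name):
        chosen, _ = self.files.pop(name)
        self.free.update(chosen)  # the blocks become available again
```

Storing the exact byte length next to the block list is what lets `read_file` discard the padding in the last block.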
However, file systems need not make use of a storage device at all. A file system can be used to organize and represent access to any data, whether it be stored or dynamically generated (e.g., from a network connection).
Whether the file system has an underlying storage device or not, file systems typically have directories which associate '
file names' with files, usually by connecting the file name to an index into a
file allocation table of some sort, such as the
FAT in an
MS-DOS file system, or an
inode in a
Unix-like file system. Directory structures may be flat, or allow hierarchies where directories may contain subdirectories. In some file systems, file names are structured, with special syntax for
filename extensions and version numbers. In others, file names are simple strings, and per-file
metadata is stored elsewhere.
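As a rough illustration of directories associating names with files, the following sketch maps each name in a directory to an inode number and resolves a hierarchical path one component at a time. The classes `Inode` and `ToyNamespace` are hypothetical and loosely modelled on the Unix-like arrangement; they are not the on-disk format of any particular file system.

```python
import itertools


class Inode:
    """Per-file record; a directory's inode holds a name -> inode-number map."""

    def __init__(self, is_dir):
        self.is_dir = is_dir
        self.entries = {} if is_dir else None  # directory contents
        self.data = b""                        # file contents


class ToyNamespace:
    def __init__(self):
        self._numbers = itertools.count(1)
        self.inodes = {}  # inode number -> Inode
        self.root = self._new_inode(is_dir=True)

    def _new_inode(self, is_dir):
        number = next(self._numbers)
        self.inodes[number] = Inode(is_dir)
        return number

    def lookup(self, path):
        """Walk the hierarchy from the root, one path component at a time."""
        number = self.root
        for part in (p for p in path.split("/") if p):
            node = self.inodes[number]
            if not node.is_dir:
                raise NotADirectoryError(path)
            if part not in node.entries:
                raise FileNotFoundError(path)
            number = node.entries[part]
        return number

    def create(self, path, is_dir=False):
        """Add a new name to the parent directory and return its inode number."""
        *parents, name = [p for p in path.split("/") if p]
        parent = self.inodes[self.lookup("/" + "/".join(parents))]
        if not parent.is_dir:
            raise NotADirectoryError(path)
        if name in parent.entries:
            raise FileExistsError(path)
        parent.entries[name] = self._new_inode(is_dir)
        return parent.entries[name]
```

A flat file system corresponds to the special case in which only the root directory may hold entries.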
Other bookkeeping information is typically associated with each file within a file system. The
length of the data contained in a file may be stored as the number of blocks allocated for the file or as an exact
byte count. The
time that the file was last modified may be stored as the file's timestamp. Some file systems also store the file creation time, the time it was last accessed, and the time that the file's metadata was changed. (Note that many early
PC operating systems did not keep track of file times.) Other information can include the file's
device type (e.g.,
block, character,
socket,
subdirectory, etc.), its owner
user-ID and
group-ID, and its
access permission settings (e.g., whether the file is read-only,
executable, etc.).
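On many systems this bookkeeping information can be inspected through the `stat` interface. The sketch below uses Python's standard `os.stat`; the helper name `describe` and the dictionary keys are illustrative choices, and which fields are meaningful depends on the operating system and the file system.

```python
import os
import stat
import time


def describe(path):
    """Collect commonly stored per-file metadata for one path."""
    info = os.stat(path)
    return {
        "size_bytes": info.st_size,                 # exact byte count
        "modified": time.ctime(info.st_mtime),      # last data modification
        "accessed": time.ctime(info.st_atime),      # last access
        # On Unix st_ctime is the last metadata change; on Windows it is
        # the creation time.
        "ctime": time.ctime(info.st_ctime),
        "owner_uid": info.st_uid,
        "group_gid": info.st_gid,
        "permissions": stat.filemode(info.st_mode), # e.g. '-rw-r--r--'
        "is_directory": stat.S_ISDIR(info.st_mode),
    }
```

Calling `describe` on a regular file and on a directory shows how the file type and the permission bits are packed into the same mode value.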
The hierarchical file system was an early research interest of
Dennis Ritchie of Unix fame; previous implementations were restricted to only a few levels, notably the IBM implementations, even of their early databases like IMS. After the success of Unix, Ritchie extended the file system concept to every object in his later operating system developments, such as
Plan 9 and
Inferno.
Traditional file systems offer facilities to create, move and delete both files and directories. They lack facilities to create additional links to a directory (
hard links in
Unix), rename parent links (".." in
Unix-like OS), and create bidirectional links to files.
Traditional file systems also offer facilities to truncate, append to, create, move, delete and in-place modify files. They do not offer facilities to
prepend to or truncate from the beginning of a file, let alone arbitrary insertion into or deletion from a file. The operations provided are highly asymmetric and lack the generality to be useful in unexpected contexts. For example, interprocess
pipes in
Unix have to be implemented outside of the file system because the file concept does not offer
truncation from the beginning of files.
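The asymmetry can be seen from an application's point of view. In the sketch below, appending and truncating from the end map directly onto standard calls, while prepending has no direct primitive and is emulated by rewriting the whole file. The helper names and the `.tmp` suffix are illustrative assumptions.

```python
import os


def append(path, data):
    """Appending is supported directly by opening in append mode."""
    with open(path, "ab") as f:
        f.write(data)


def truncate(path, length):
    """Truncation from the end is a single call."""
    os.truncate(path, length)


def prepend(path, data):
    """No direct primitive exists, so the entire file is rewritten."""
    with open(path, "rb") as f:
        old = f.read()
    tmp = path + ".tmp"  # hypothetical temporary name
    with open(tmp, "wb") as f:
        f.write(data + old)
    os.replace(tmp, path)  # swap the rewritten copy into place
```

The cost of `prepend` grows with the size of the file, whereas `append` and `truncate` do not need to touch the existing data.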
Secure access to basic file system operations can be based on a scheme of
access control lists or
capabilities. Research has shown access control lists to be difficult to secure properly, which is why research operating systems tend to use capabilities. Commercial file systems still use access control lists. ''see:
secure computing''
Arbitrary attributes can be associated with files on advanced file systems, such as
XFS,
ext2/
ext3, some versions of
UFS, and
HFS+, using
extended file attributes. This feature is implemented in the kernels of
Linux,
FreeBSD and
Mac OS X operating systems, and allows metadata to be associated with the file at the ''file system'' level. This, for example, could be the author of a document, the character encoding of a plain-text document, or a checksum.
Types of file systems
File system types can be classified into disk file systems, network file systems and special purpose file systems.
Disk file systems
A ''disk file system'' is a file system designed for the storage of
files on a
data storage device, most commonly a
disk drive, which might be directly or indirectly connected to the computer. Examples of disk file systems include
FAT,
FAT32,
NTFS,
HFS and
HFS+,
ext2,
ext3,
ISO 9660,
ODS-5, and
UDF.
Some disk file systems are
journaling file systems or
versioning file systems.
Flash file systems
A ''flash file system'' is a file system designed for storing
files on
flash memory devices. These are becoming more prevalent as the number of mobile devices is increasing, and the capacity of flash memories catches up with hard drives.
While a
block device layer can emulate hard drive behavior and store regular file systems on a flash device, this is suboptimal for several reasons:
★ Erasing blocks: Flash memory blocks have to be explicitly erased before they can be written to. The time taken to erase blocks can be significant, thus it is beneficial to erase unused blocks while the device is idle.
★ Random access: Disk file systems are optimized to avoid disk seeks whenever possible, due to the high cost of seeking. Flash memory devices impose no seek latency.
★ Wear levelling: Flash memory devices tend to "wear out" when a single block is repeatedly overwritten; flash file systems try to spread out writes as evenly as possible.
Log-structured file systems have the properties desirable for a flash file system. Such file systems include
JFFS2 and
YAFFS.
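Wear levelling, mentioned above, can be sketched as an allocation policy that always hands out the free block with the fewest erase cycles. The class `WearLevellingAllocator` is hypothetical and is not how JFFS2 or YAFFS are implemented; it only illustrates the idea of spreading erasures evenly.

```python
class WearLevellingAllocator:
    """Hands out the least-worn free block so erasures spread evenly."""

    def __init__(self, num_blocks):
        self.erase_counts = [0] * num_blocks
        self.free = set(range(num_blocks))

    def allocate(self):
        if not self.free:
            raise OSError("no free blocks")
        # Prefer the block erased the fewest times; break ties by number.
        block = min(self.free, key=lambda b: (self.erase_counts[b], b))
        self.free.remove(block)
        return block

    def release(self, block):
        """Erase a block (counting the wear) and return it to the free pool."""
        self.erase_counts[block] += 1
        self.free.add(block)
```

A policy that always reused the lowest-numbered free block would instead concentrate all erasures on a single block.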
Database file systems
A newer concept for file management is the database-based file system. Instead of, or in addition to, hierarchical structured management, files are identified by their characteristics, like type of file, topic, author, or similar
metadata.
Transactional file systems
A transactional file system is a special kind of file system in that it logs events or transactions to files.
A single operation may involve changes to a number of different files and disk structures. In many cases, these changes are related, meaning that it is important that either all of them are applied or none are.
Take, for example, a bank sending another bank some money electronically. The bank's computer will "send" the transfer instruction to the other bank and also update its own records to indicate that the transfer has occurred. If the computer crashes before it has had a chance to update its own records, then on restart there will be no record of the transfer, but the bank will be missing some money. A transactional system can rebuild the actions by resynchronizing the "transactions" on both ends to correct the failure. All transactions can be saved as well, providing a complete record of what was done and where. This type of file system is designed and intended to be fault tolerant, and necessarily incurs a high degree of overhead.
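The all-or-nothing behaviour can be sketched with a toy journal: related changes are logged together, a commit record marks them as complete, and recovery after a crash replays only the committed groups. The class `ToyJournal` and its method names are hypothetical, and the "log" here is an in-memory list standing in for storage that survives a crash.

```python
class ToyJournal:
    """Logs grouped changes and replays only those that were committed."""

    def __init__(self):
        self.log = []    # stands in for a journal kept on stable storage
        self.state = {}  # the main data, rebuilt by recover()

    def begin(self, txid, changes):
        """Record the intended changes before touching the main state."""
        self.log.append(("begin", txid, dict(changes)))

    def commit(self, txid):
        """Mark every change of the transaction as complete."""
        self.log.append(("commit", txid, None))

    def recover(self):
        """Rebuild state from the log, skipping uncommitted transactions."""
        committed = {txid for kind, txid, _ in self.log if kind == "commit"}
        state = {}
        for kind, txid, changes in self.log:
            if kind == "begin" and txid in committed:
                state.update(changes)  # replayed in the order they were logged
        self.state = state
        return state
```

In terms of the bank example, a transfer whose two balance updates were logged but never committed leaves both balances as they were after the last committed transfer.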
Network file systems
Main articles: Network file system
A network file system is a file system that acts as a client for a remote file access protocol, providing access to files on a server. Examples of network file systems include clients for the
NFS and
SMB protocols, and file-system-like clients for
FTP and
WebDAV.
Special purpose file systems
A special purpose file system is any file system that is not a disk file system or network file system. This includes systems where the
files are arranged dynamically by
software, intended for such purposes as communication between
computer processes or temporary file space.
Special purpose file systems are most commonly used by file-centric operating systems such as Unix. Examples include the
procfs (
/proc) file system used by some Unix variants, which grants access to information about
processes and other operating system features.
Deep space science exploration craft, like
Voyager I &
II, used digital tape based special file systems. Most modern space exploration craft, like
Cassini-Huygens, used
real-time operating system (RTOS) file systems or RTOS-influenced file systems. The
Mars Rovers use one such RTOS file system, notable in this case because it is implemented in flash memory.
File systems and operating systems
Most
operating systems provide a file system, as a file system is an integral part of any modern operating system. Early
microcomputer operating systems' only real task was file management — a fact reflected in their names (see
DOS). Some early operating systems had a separate component for handling file systems which was called a
disk operating system. On some microcomputers, the disk operating system was loaded separately from the rest of the operating system. On early operating systems, there was usually support for only one, native, unnamed file system; for example,
CP/M supports only its own file system, which might be called "CP/M file system" if needed, but which didn't bear any official name at all.
The operating system software must provide an interface between the user and the file system. This interface can be textual (such as provided by a
command line interface, such as the
Unix shell, or
OpenVMS DCL) or graphical (such as provided by a
graphical user interface, such as
file browsers). If graphical, the metaphor of the ''
folder'', containing documents, other files, and nested folders is often used (see also:
directory and
folder).
Flat file systems
In a flat file system, there are no
subdirectories—everything is stored at the same (
root) level on the media, be it a
hard disk,
floppy disk, etc. While simple, this system rapidly becomes inefficient as the number of files grows, and makes it difficult for users to organise data into related groups.
Like many small systems before it, the original
Apple Macintosh featured a flat file system, called
Macintosh File System. Its version of
Mac OS was unusual in that the file management software (
Macintosh Finder) created the illusion of a partially hierarchical filing system on top of MFS. This structure meant that every file on a disk had to have a unique name, even if it appeared to be in a separate folder. MFS was quickly replaced with
Hierarchical File System, which supported real
directories.
File systems under Unix and Linux systems
Unix operating systems create a virtual file system, which makes all the files on all the devices appear to exist in a single hierarchy. This means, in Unix, there is one
root directory, and every file existing on the system is located under it somewhere. Furthermore, the Unix root directory does not have to be in any physical place. It might not be on your first hard drive - it might not even be on your computer. Unix can use a network shared resource as its root directory.
Unix assigns a device name to each device, but this is not how the files on that device are accessed. Instead, to gain access to files on another device, you must first inform the operating system where in the directory tree you would like those files to appear. This process is called
mounting a file system. For example, to access the files on a
CD-ROM, one must tell the operating system "Take the file system from this CD-ROM and make it appear under such-and-such directory". The directory given to the operating system is called the ''
mount point'' - it might, for example, be
/media. The
/media directory exists on many Unix systems (as specified in the
Filesystem Hierarchy Standard) and is intended specifically for use as a mount point for removable media such as CDs, DVDs and floppy disks. It may be empty, or it may contain subdirectories for mounting individual devices. Generally, only the
administrator (i.e.
root user) may authorize the mounting of file systems.
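How mounting makes several file systems appear as one hierarchy can be sketched as a table of mount points that is consulted for the longest matching prefix of a path. The class `MountTable` and the file system labels are hypothetical; real kernels resolve paths component by component rather than by string prefix.

```python
import posixpath


class MountTable:
    """Maps mount points to file systems and resolves paths against them."""

    def __init__(self):
        self.mounts = {"/": "rootfs"}  # the root file system is always present

    def mount(self, fs_name, mount_point):
        self.mounts[posixpath.normpath(mount_point)] = fs_name

    def resolve(self, path):
        """Return (file system, path relative to its mount point)."""
        path = posixpath.normpath(path)
        best = "/"
        for mp in self.mounts:
            prefix = mp.rstrip("/") + "/"
            inside = path == mp or path.startswith(prefix)
            if inside and len(mp) > len(best):
                best = mp  # the longest matching mount point wins
        return self.mounts[best], posixpath.relpath(path, best)
```

After mounting a CD-ROM file system at a directory such as `/media/cdrom`, paths below that directory resolve to the CD-ROM, while every other path still resolves to the root file system.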
Unix-like operating systems often include software and tools that assist in the mounting process and provide it with new functionality. Some of these strategies have been termed "auto-mounting" as a reflection of their purpose.
#In many situations, file systems other than the root need to be available as soon as the operating system has
booted. All Unix-like systems therefore provide a facility for mounting file systems at boot time. System
administrators define these file systems in the configuration file
fstab, which also indicates options and mount points.
#In some situations, there is no need to mount certain file systems at
boot time, although their use may be desired thereafter. There are some utilities for Unix-like systems that allow the mounting of predefined file systems upon demand.
#Removable media have become very common with
microcomputer platforms. They allow programs and data to be transferred between machines without a physical connection. Common examples include
USB flash drives,
CD-ROMs and
DVDs. Utilities have therefore been developed to detect the presence and availability of a medium and then mount that medium without any user intervention.
#Some Unix-like systems have also introduced a concept called 'supermounting'; see, for example,
the Linux supermount-ng project. For example, a floppy disk that has been supermounted can be physically removed from the system. Under normal circumstances, the disk should have been synchronised and then unmounted before its removal. Provided synchronisation has occurred, a different disk can be inserted into the drive. The system automatically notices that the disk has changed and updates the mount point contents to reflect the new medium. Similar functionality is found on standard Windows machines.
#A similar innovation preferred by some users is the use of
autofs, a system that, like supermounting, eliminates the need for manual mounting commands. Apart from apparently being compatible with a greater range of applications, such as access to
file systems on network servers, autofs differs from supermount in that devices are mounted transparently when requests to their file systems are made, as is appropriate for file systems on network servers, rather than in response to events such as the insertion of media, as is appropriate for removable media.
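The boot-time configuration mentioned in the first item of the list above can be illustrated by parsing a Linux-style fstab, in which each non-comment line holds a device, a mount point, a file system type, a comma-separated option list, and two optional numeric fields. The helper `parse_fstab`, the field names of `FstabEntry`, and the sample entries used to exercise it are assumptions for illustration only.

```python
from typing import NamedTuple


class FstabEntry(NamedTuple):
    device: str
    mount_point: str
    fs_type: str
    options: list
    dump: int
    fsck_pass: int


def parse_fstab(text):
    """Parse fstab-style text into a list of entries, skipping comments."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 4:
            raise ValueError("malformed fstab line: " + line)
        fields += ["0"] * (6 - len(fields))  # last two fields default to 0
        device, mount_point, fs_type, options, dump, fsck_pass = fields[:6]
        entries.append(FstabEntry(device, mount_point, fs_type,
                                  options.split(","), int(dump), int(fsck_pass)))
    return entries
```

An entry whose option list contains `noauto` is, by convention, defined in the file but not mounted at boot, which corresponds to the on-demand case in the second item.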
File systems under Mac OS X
Mac OS X uses a file system that it inherited from
Mac OS called
HFS Plus. HFS Plus is a
metadata-rich and
case-preserving file system. Due to the Unix roots of Mac OS X, Unix permissions were added to HFS Plus. Later versions of HFS Plus added
journaling to prevent corruption of the file system structure and introduced a number of optimizations to the allocation algorithms in an attempt to defragment files automatically without requiring an external defragmenter.
Filenames can be up to 255 characters. HFS Plus uses
Unicode to store filenames. On Mac OS X, the
filetype can come from the
type code stored in the file's metadata or from the filename.
HFS Plus has three kinds of links: Unix-style
hard links, Unix-style
symbolic links and
aliases. Aliases are designed to maintain a link to their original file even if they are moved or renamed; they are not interpreted by the file system itself, but by the File Manager code in
userland.
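The difference between the two Unix-style link kinds can be observed with standard calls, assuming a Unix-like system and a file system that supports both. The function `link_demo` and the file names are illustrative; aliases are specific to the Mac OS File Manager and are not modelled here.

```python
import os


def link_demo(directory):
    """Create a hard link and a symbolic link, then remove the original."""
    original = os.path.join(directory, "original.txt")
    hard = os.path.join(directory, "hard.txt")
    soft = os.path.join(directory, "soft.txt")
    with open(original, "w") as f:
        f.write("data")

    os.link(original, hard)     # second name for the same inode
    os.symlink(original, soft)  # small file that stores the target path

    same_inode = os.stat(original).st_ino == os.stat(hard).st_ino
    link_count = os.stat(original).st_nlink

    os.remove(original)
    hard_survives = os.path.exists(hard)  # the data is still reachable
    soft_dangles = os.path.islink(soft) and not os.path.exists(soft)
    return same_inode, link_count, hard_survives, soft_dangles
```

The dangling symbolic link at the end is the situation that aliases are designed to avoid, since they are meant to keep tracking the original when it is moved or renamed.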
Mac OS X also supports the
UFS file system, derived from the
BSD Unix Fast File System via
NeXTSTEP.
File systems under Plan 9 from Bell Labs
Plan 9 from Bell Labs was originally designed to extend some of Unix's good points, and to introduce some new ideas of its own while fixing the shortcomings of Unix.
With respect to file systems, the Unix system of treating things as files was continued, but in Plan 9, ''everything'' is treated as a file, and accessed as a file would be (i.e., no
ioctl or
mmap). Perhaps surprisingly, while the file interface is made universal, it is also simplified considerably: for example, symlinks, hard links and suid are made obsolete, and an atomic create/open operation is introduced. More importantly, the set of file operations becomes well defined, and subversions of it such as ioctl are eliminated.
Secondly, the underlying
9P protocol was used to remove the difference between local and remote files (except for a possible difference in
latency). This has the advantage that a device or devices, represented by files, on a remote computer can be used as though they were the local computer's own device(s). This means that under Plan 9, multiple file servers provide access to devices, classing them as file systems. Servers for "synthetic" file systems can also run in user space, bringing many of the advantages of microkernel systems while maintaining the simplicity of the system.
Everything on a Plan 9 system has an abstraction as a file: networking, graphics, debugging, authentication, capabilities, encryption, and other services are accessed via I/O operations on file descriptors. For example, this allows the use of the IP stack of a gateway machine without need of NAT, or provides a network-transparent window system without the need of any extra code.
Another example: a Plan 9 application receives
FTP service by opening an FTP site. The
ftpfs server handles the open by essentially mounting the remote FTP site as part of the local file system. With ftpfs as an intermediary, the application can now use the usual file-system operations to access the FTP site as if it were part of the local file system. A further example is the mail system which uses file servers that synthesize
virtual files and directories to represent a user mailbox as
/mail/fs/mbox. The
wikifs provides a file system interface to a wiki.
These file systems are organized with the help of private, per-process namespaces, allowing each process to have a different view of the many file systems that provide resources in a distributed system.
The
Inferno operating system shares these concepts with Plan 9.
File systems under Microsoft Windows
Windows makes use of the FAT and
NTFS (New Technology File System) file systems.
The
FAT (File Allocation Table) filing system, supported by all versions of
Microsoft Windows, was an evolution of that used in Microsoft's earlier operating system (
MS-DOS which in turn was based on
86-DOS). FAT ultimately traces its roots back to the short-lived
M-DOS project and
Standalone disk BASIC before it. Over the years various features have been added to it, inspired by similar features found on file systems used by operating systems such as
Unix.
Older versions of the FAT file system (FAT12 and FAT16) had file name length limits, a limit on the number of entries in the root directory of the file system and had restrictions on the maximum size of FAT-formatted disks or
partitions. Specifically, FAT12 and FAT16 had a limit of 8 characters for the file name, and 3 characters for the extension. This is commonly referred to as the
8.3 filename limit. VFAT, which was an extension to FAT12 and FAT16 introduced in
Windows NT 3.5 and subsequently included in Windows 95, allowed long file names (
LFN). FAT32 also addressed many of the limits in FAT12 and FAT16, but remains limited compared to NTFS.
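The 8.3 limit can be expressed as a simple length check. The function `is_8_3_name` is a hypothetical helper that tests only the lengths described above (up to 8 characters for the name, up to 3 for the extension, at most one dot); real FAT implementations also restrict which characters are allowed, which is not modelled here.

```python
def is_8_3_name(filename):
    """Check only the length limits of an 8.3 file name."""
    base, dot, ext = filename.partition(".")
    if "." in ext:
        return False  # more than one dot
    if dot and not ext:
        return False  # a dot with nothing after it
    return 1 <= len(base) <= 8 and len(ext) <= 3
```

A name such as `AUTOEXEC.BAT` fits exactly, while `index.html` fails because its extension has four characters.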
NTFS, introduced with the
Windows NT operating system, allowed
ACL-based permission control. Hard links, multiple file streams, attribute indexing, quota tracking, compression and mount-points for other file systems (called "junctions") are also supported, though not all these features are well-documented.
Unlike many other operating systems, Windows uses a ''drive letter'' abstraction at the user level to distinguish one disk or partition from another. For example, the
path C:\WINDOWS represents a directory
WINDOWS on the partition represented by the letter C. The C drive is most commonly used for the primary hard disk partition, on which Windows is installed and from which it boots. This "tradition" became so firmly ingrained that older versions of Windows contained bugs caused by the assumption that the operating system was installed on drive C. The tradition of using "C" for the drive letter can be traced to MS-DOS, where the letters A and B were reserved for up to two floppy disk drives; in a common configuration, A would be the
3½-inch floppy drive, and B the
5¼-inch one. Network drives may also be mapped to drive letters.
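The drive letter abstraction is visible in how Windows-style paths are parsed. The sketch below uses `PureWindowsPath` from Python's standard `pathlib`, which parses such paths on any platform without touching the disk; the helper name `split_drive` is an illustrative choice.

```python
from pathlib import PureWindowsPath


def split_drive(path):
    """Split a Windows-style path into its drive and remaining components."""
    p = PureWindowsPath(path)
    # When the path has an anchor (drive and/or root), it is the first part.
    rest = p.parts[1:] if p.anchor else p.parts
    return p.drive, rest
```

For a path on drive C the first element returned is `C:`, and the remaining elements are the directory and file names below the root of that drive.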
File systems under OpenVMS
Main articles: Files-11
File systems under MVS [IBM Mainframe]
Main articles: MVS#MVS filesystem
See also
★
List of file systems
★
Comparison of file systems
★
Virtual file system
★
File system fragmentation
★
Distributed file system
★
Filesystem API
★
Physical and logical storage
★
List of Unix programs
★
Filename extension
★
Disk sharing
★
File system driver