Friday, April 18, 2008
In computing, file system fragmentation, sometimes called file system aging, is the inability of a file system to lay out related data sequentially (contiguously), an inherent phenomenon in storage-backed file systems that allow in-place modification of their contents. It is a special case of data fragmentation. File system fragmentation increases disk head movement, or seeks, which are known to hinder throughput. The remedy for existing fragmentation is to reorganize files and free space back into contiguous areas, a process called defragmentation.
Why fragmentation occurs
When a file system is first created, almost all of its free space is one contiguous region. As files are created, deleted, and resized over time, that free space is broken into ever smaller holes, and new or growing files must be scattered across whatever gaps remain.
Performance implications
File system fragmentation is projected to become more problematic with newer hardware due to the increasing disparity between sequential access speed and rotational delay (and, to a lesser extent, seek time) of consumer-grade hard disks, on which file systems are usually placed.
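To get a feel for the cost, here is a back-of-the-envelope sketch in Python with assumed figures (100 MB/s sequential throughput and a 12 ms combined seek and rotational delay; real disks vary):

# Illustrative, assumed figures for a consumer-grade hard disk.
SEQUENTIAL_MBPS = 100.0  # sustained sequential read rate, MB/s
ACCESS_TIME_MS = 12.0    # average seek plus rotational delay, ms

def read_time_ms(file_mb, fragments):
    """Time to read a file: one access penalty per fragment plus transfer."""
    return fragments * ACCESS_TIME_MS + (file_mb / SEQUENTIAL_MBPS) * 1000.0

print(read_time_ms(10.0, 1))   # contiguous 10 MB file: 112 ms
print(read_time_ms(10.0, 10))  # same file in 10 fragments: 220 ms

Under these assumptions, each extra fragment costs about as much time as streaming another megabyte, so splitting a 10 MB file into ten pieces nearly doubles its read time.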
Types of fragmentation
File system fragmentation may occur on several levels:
Fragmentation within individual files and their metadata.
Free space fragmentation, making it increasingly difficult to lay out new files contiguously.
The decrease of locality of reference between separate but related files.
File fragmentation
Individual file fragmentation occurs when a single file has been broken into multiple pieces (called extents on extent-based file systems). While disk file systems attempt to keep individual files contiguous, this is often not possible without significant performance penalties. File system check and defragmentation tools typically account only for file fragmentation in their "fragmentation percentage" statistic.
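As a rough sketch of how such a tool might derive that statistic, the Python below counts fragments from a file's extent list; the extent representation and the percentage definition are assumptions for illustration, not any particular tool's behavior:

def count_fragments(extents):
    """Count fragments in a list of (start_block, length) extents.
    Extents that directly follow one another on disk merge into one fragment."""
    count, prev_end = 0, None
    for start, length in sorted(extents):
        if start != prev_end:  # a gap on disk starts a new fragment
            count += 1
        prev_end = start + length
    return count

def fragmentation_percentage(files):
    """Share of files stored in more than one piece (one common definition)."""
    fragmented = sum(1 for extents in files if count_fragments(extents) > 1)
    return 100.0 * fragmented / len(files) if files else 0.0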
Free space fragmentation
Free (unallocated) space fragmentation occurs when the unused areas of the file system, where new files or metadata can be written, are scattered in many small pieces. Unwanted free space fragmentation is generally caused by the deletion or truncation of files, but file systems may also intentionally insert fragments ("bubbles") of free space to facilitate extending nearby files (see proactive techniques below).
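One simple, hypothetical way to gauge free space fragmentation is to compare the largest contiguous free run against the total free space; if the largest hole is small, a large file cannot be laid out contiguously even though enough space "exists" overall:

def free_space_summary(free_extents):
    """free_extents: list of (start_block, length) runs of unallocated blocks."""
    total = sum(length for _, length in free_extents)
    largest = max((length for _, length in free_extents), default=0)
    return total, largest

total, largest = free_space_summary([(0, 8), (100, 2), (400, 6)])
print(total, largest)  # 16 blocks free, but at most 8 of them contiguous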
Related file fragmentation
Related file fragmentation, also called application-level (file) fragmentation, refers to the lack of locality of reference between related files. Unlike the previous two types, related file fragmentation is a much vaguer concept, as it depends heavily on the access patterns of specific applications. This also makes it very difficult to measure or estimate objectively. It is arguably the most critical type of fragmentation, however, as studies have found that the most frequently accessed files tend to be small relative to the disk's sequential throughput per second, so reading them is dominated by seeks rather than by data transfer.
To avoid related file fragmentation and improve locality of reference, assumptions about how applications operate have to be made. A common assumption is that it is worthwhile to keep the smaller files within a single directory together and lay them out in the natural file system order. While this is often reasonable, it does not always hold. For example, an application might read several files, perhaps in different directories, in the exact same order they were written. Thus, a file system that simply orders all writes successively might perform faster for that application.
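One hypothetical way to put a number on related file fragmentation is to replay an observed access trace and total the head movement between consecutively accessed files. The trace, layout, and cost model below are invented for illustration and ignore caching and request reordering:

def seek_cost(access_trace, first_block):
    """Total head movement, in blocks, over an ordered trace of file accesses.
    first_block maps each file name to its starting block on disk."""
    cost, pos = 0, None
    for name in access_trace:
        block = first_block[name]
        if pos is not None:
            cost += abs(block - pos)
        pos = block
    return cost

# Files placed in directory order, but read in a different order.
layout = {"a.cfg": 0, "b.dat": 100, "c.idx": 200}
print(seek_cost(["c.idx", "a.cfg", "b.dat"], layout))  # 300 blocks of movement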
Techniques for mitigating fragmentation
Several techniques have been developed to fight fragmentation. They are usually classified into two categories: proactive and retroactive. Because file access patterns are hard to predict, these techniques are most often heuristic in nature and may degrade performance under unexpected workloads.
Proactive techniques
Proactive techniques attempt to keep fragmentation to a minimum at the time data is written to the disk. The simplest of these is perhaps appending data to an existing fragment in place where possible, instead of allocating new blocks to a new fragment.
Many of today's file systems attempt to preallocate longer chunks, or chunks from different free space fragments, called extents, to files that are actively appended to. This mainly avoids file fragmentation when several files are being appended to concurrently, keeping them from becoming excessively intertwined.
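A toy allocator illustrating both ideas might first try to grow a file's last extent in place, and otherwise place the append at the start of a larger free run so later appends can continue contiguously. Every name, and the extent size, is invented for illustration:

def find_free_run(free, length):
    """Return the first block of a run of `length` consecutive free blocks."""
    for start in sorted(free):
        if all(start + i in free for i in range(length)):
            return start
    return None  # no run long enough

def allocate_append(free, last_extent, need, extent_size=64):
    """Pick blocks for appending `need` blocks to a file.
    free is a set of free block numbers; last_extent is (start, length)."""
    start, length = last_extent
    blocks, nxt = [], start + length
    while len(blocks) < need and nxt in free:  # grow the last extent in place
        free.discard(nxt)
        blocks.append(nxt)
        nxt += 1
    if len(blocks) < need:
        # Fall back to a fresh run at least extent_size blocks long; the
        # unused tail stays free right after the file for future appends.
        run = find_free_run(free, max(need - len(blocks), extent_size))
        if run is not None:
            for b in range(run, run + need - len(blocks)):
                free.discard(b)
                blocks.append(b)
    return blocks  # may be shorter than `need` on a full disk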
BitTorrent and other peer-to-peer file-sharing clients have an "anti-fragmentation" feature that allocates the full space needed for a file when initiating a download.
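On POSIX systems, one way to do this from Python is os.posix_fallocate, which asks the file system to reserve all of a file's blocks up front (whether they end up contiguous is still up to the file system; the call is not available on Windows):

import os

def preallocate(path, size):
    """Reserve `size` bytes on disk before any data arrives, so the finished
    file can be laid out in one piece instead of growing piecemeal."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o644)
    try:
        os.posix_fallocate(fd, 0, size)  # Unix only; allocates real blocks
    finally:
        os.close(fd)

preallocate("download.bin", 700 * 1024 * 1024)  # e.g. a 700 MB download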
Retroactive techniques
Retroactive techniques attempt to reduce fragmentation, or its negative effects, after it has occurred. Many file systems provide defragmentation tools, which attempt to reorder the fragments of files and often also to increase locality of reference by keeping smaller files in directories, or directory trees, close to each other on the disk.
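As a toy model of what such a tool does, the sketch below rewrites every file into a fresh contiguous run, packing files front to back so the remaining free space becomes contiguous as well; real defragmenters work in place, and far more carefully:

def defragment(files, disk_blocks):
    """files: dict of name -> size in blocks.
    Returns a new layout of name -> (start_block, length)."""
    layout, cursor = {}, 0
    for name, blocks in sorted(files.items()):  # keep directory order together
        layout[name] = (cursor, blocks)
        cursor += blocks
    if cursor > disk_blocks:
        raise ValueError("files do not fit on the disk")
    return layout

print(defragment({"a.log": 5, "b.db": 12, "c.txt": 3}, disk_blocks=64))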
When a file is opened, the HFS Plus file system transparently defragments it if it is less than 20 MiB in size and broken into eight or more fragments.
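That policy reduces to a simple predicate; this paraphrase, with invented names, captures the published rule:

TWENTY_MIB = 20 * 1024 * 1024

def hfsplus_would_defragment(size_bytes, fragment_count):
    """HFS Plus relocates a file to a contiguous area when it is opened,
    if it is under 20 MiB and split into eight or more fragments."""
    return size_bytes < TWENTY_MIB and fragment_count >= 8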