Understand File Systems

Understand Various File Systems of Windows, Linux, and macOS Operating Systems

Understanding File Systems

  • File systems use a hierarchical structure of files and directories (folders) to logically organize data. This makes it easier for users to navigate and find specific files.
  • They define a format for specifying file locations using paths that trace through the directory structure.
  • File systems are organized as tree-like structures with directories nested within each other. Access to these directories can be controlled through authorization mechanisms.
  • File systems handle storage, hierarchical categorization, management, navigation, access, and recovery of data.
  • Common file systems include FAT, NTFS, HFS, HFS+, APFS, Ext2, Ext3, and Ext4.
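The hierarchical organization and path-based addressing described above can be illustrated with a short Python sketch that builds a small directory tree in a temporary location and traces a path through it (the directory and file names are arbitrary examples, not part of any real layout):

```python
from pathlib import Path
import tempfile

# Build a tiny directory tree in a temporary location (names are arbitrary).
root = Path(tempfile.mkdtemp())
(root / "reports" / "2024").mkdir(parents=True)
(root / "reports" / "2024" / "q1.txt").write_text("quarterly data")

# A path traces the route through the directory structure to one file.
target = root / "reports" / "2024" / "q1.txt"
print(target.relative_to(root))  # e.g. reports/2024/q1.txt on Linux/macOS
print(target.read_text())
```

The same logical hierarchy is expressed differently per OS: Windows paths use drive letters and backslashes, while Linux and macOS trace every path from the root directory /.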

 

Windows File Systems

 

File Allocation Table (FAT)

  • The FAT file system was designed for DOS (Disk Operating System), and it was the first file system used with the Windows OS.
  • It is named for its method of organization, the file allocation table, which resides at the beginning of the volume.
  • FAT comes in three versions (FAT12, FAT16, and FAT32), which differ in the size of the entries in the FAT structure.
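Conceptually, each entry in the file allocation table holds the number of the next cluster in a file's chain, or an end-of-chain marker. The following Python sketch models this with a plain dictionary rather than a real on-disk table; the cluster numbers are invented for illustration:

```python
# Simplified model of a file allocation table: key = cluster number,
# value = number of the next cluster in the chain. 0xFFF is the FAT12
# end-of-chain marker (FAT16 and FAT32 use wider markers).
EOC = 0xFFF

fat = {2: 5, 5: 6, 6: 9, 9: EOC}  # one file occupying clusters 2 -> 5 -> 6 -> 9

def cluster_chain(fat, start):
    """Follow the chain of clusters that belong to a single file."""
    chain, cluster = [], start
    while cluster != EOC:
        chain.append(cluster)
        cluster = fat[cluster]
    return chain

print(cluster_chain(fat, 2))  # [2, 5, 6, 9]
```

The entry width (12, 16, or 32 bits) is exactly what distinguishes FAT12, FAT16, and FAT32.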

FAT32

  • The FAT32 file system is derived from the FAT file system and supports drives up to 2 terabytes in size.
  • It uses drive space efficiently because it uses small clusters.
  • It can use the backup copy of the file allocation table instead of the default copy.

exFAT

 exFAT is more compatible across various OSes than earlier FAT-based file systems. It is compatible with Windows 10 and later versions, Mac OS X and all later versions, and Linux kernel 5.4 or newer versions.

Some of the advantages of the exFAT file system are as follows:

  • Manages storage efficiently.
  • Imposes practically no restrictions on the partition size or file size of the volume.
  • Supports data recovery.
  • Is highly compatible with most of the latest OS versions and storage devices.
  • Is best suited to external storage devices.

New Technology File System (NTFS)

NTFS is one of the most widely used file systems supported by Windows. It is a high-performance, self-repairing file system that supports advanced features such as file-level security, compression, and auditing. It also supports large-volume storage solutions such as self-recovering disks.

  • NTFS is the standard file system of Windows NT and its descendants, including Windows XP, Vista, 7, 8.1, and 10, as well as Windows Server 2003, 2008, and 2012.
  • It has been the default file system of the Windows NT family since Windows NT 3.1.
  • It offers several improvements over FAT, such as improved metadata support, the use of advanced data structures to improve performance, reliability, and disk-space utilization, and additional extensions such as security access control lists.
  • It provides data security on both removable and fixed disks.
  • It allows storing and transferring large or multiple files.

NTFS Architecture

The MBR is created when the file system's volume is formatted. It contains the hard disk's partition table and executable code known as the master boot code. Whenever a new volume is mounted, the master boot record launches the master boot code. Under NTFS, each file is stored in clusters and has its own attributes, such as the file's name, size, and contents. As a result, the internal organization of NTFS resembles a database in which the operating system treats every file as an object.

Components of the NTFS architecture are as follows:

  • Hard disk: It contains at least one partition.
  • Master boot record: It includes executable master boot code that is loaded into memory by the computer's BIOS. This code scans the MBR to locate the partition table and determine which partition is active and bootable.
  • Boot sector: Also referred to as the volume boot record (VBR), this is the first sector in an NTFS file system; it contains the boot code and additional data, including the type, location, and size of the data.
  • Ntldr (NTLDR): As the boot loader, it accesses the NTFS file system and loads the contents of the boot.ini file.
  • Ntfs.sys: It is the file system driver for NTFS.
  • Kernel mode: It is the processing mode that permits executable code to have direct access to all system components.
  • User mode: It is the processing mode in which executable programs run without direct access to system components.
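The partition table described above can be parsed directly from the first 512-byte sector of a disk image. The sketch below is a simplified parser run against a synthetic sector; it is illustrative, not a complete forensic tool:

```python
import struct

def parse_mbr(sector: bytes):
    """Parse the partition table of a 512-byte MBR sector.

    The table holds four 16-byte entries starting at offset 446, and the
    sector ends with the 0x55AA boot signature.
    """
    if len(sector) != 512 or sector[510:512] != b"\x55\xaa":
        raise ValueError("not a valid MBR sector")
    partitions = []
    for i in range(4):
        entry = sector[446 + i * 16 : 446 + (i + 1) * 16]
        boot_flag = entry[0]                              # 0x80 = bootable/active
        ptype = entry[4]                                  # e.g. 0x07 = NTFS/exFAT
        lba_start = struct.unpack_from("<I", entry, 8)[0]
        num_sectors = struct.unpack_from("<I", entry, 12)[0]
        if ptype != 0:                                    # 0x00 = unused slot
            partitions.append({"bootable": boot_flag == 0x80, "type": ptype,
                               "lba_start": lba_start, "sectors": num_sectors})
    return partitions

# Synthetic sector: one bootable NTFS-type partition starting at LBA 2048.
mbr = bytearray(512)
mbr[510:512] = b"\x55\xaa"
mbr[446:462] = struct.pack("<B3sB3sII", 0x80, b"\x00\x00\x00",
                           0x07, b"\x00\x00\x00", 2048, 204800)
print(parse_mbr(bytes(mbr)))
```

Real images add complications (extended partitions, GPT protective MBRs) that this sketch ignores.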

NTFS Master File Table (MFT)

  • A unique file known as the master file table (MFT) contains records for every file on an NTFS drive.
  • The first 16 records of the table are reserved for special information.
  • The first record of this table describes the master file table itself, followed by an MFT mirror record.
  • If the first MFT record is corrupted, NTFS reads the second record to find the MFT mirror file, whose first record is identical to the first record of the MFT.
  • The boot sector stores the locations of the data segments for both the MFT and MFT mirror files; a duplicate of the boot sector is situated at the logical center of the disk.
  • The log file, which is used for file recovery, is the third record in the MFT. The seventeenth and subsequent records of the master file table represent each file and directory on the volume (directories are also treated as files by NTFS).
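Every MFT record begins with a 4-byte signature that forensic tools check before parsing the record further. A minimal sketch over a synthetic record (the record here carries only the signature, for illustration):

```python
def is_mft_record(record: bytes) -> bool:
    """Check the 4-byte signature at the start of an NTFS MFT record.

    Healthy records start with b"FILE"; NTFS rewrites the signature of a
    record damaged by a failed multi-sector write as b"BAAD".
    """
    return record[:4] == b"FILE"

# Synthetic 1024-byte record carrying only the signature, for illustration.
record = b"FILE" + bytes(1020)
print(is_mft_record(record))       # True
print(is_mft_record(bytes(1024)))  # False
```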

Encrypting File Systems (EFS)

  • Version 3.0 of NTFS introduced the Encrypting File System (EFS), which provides filesystem-level encryption.
  • This encryption technology preserves a degree of transparency: the user who encrypted the file can access it and make changes without having to decrypt it manually.
  • The encryption policy is immediately restored once the user has finished using the file, and unauthorized users are prevented from accessing encrypted files.
  • To activate the encryption and decryption features, a user must configure the encryption properties of the files and folders they wish to encrypt or decrypt.

Components of EFS

CryptoAPI

Application developers can use CryptoAPI's collection of functions to add cryptography to their Win32 programs. These functions enable applications to digitally sign or encrypt data and protect private-key data.
Both public-key and symmetric-key operations are supported, including digital signatures and signature verification, hashing, encryption, decryption, key exchange, key management, and secure key storage.

 

Resilient File System (ReFS)

ReFS, the most recent file system, was created by Microsoft specifically for the Windows OS as the successor to NTFS. It scales massive data sets effectively across a variety of workloads, improves data availability, and provides data integrity and resilience to corruption. It is compatible with Windows 8.1, Windows Server 2019, and subsequent versions; Microsoft released ReFS support on Windows 11 in build 25276.

Many of the NTFS file system's characteristics were carried over to ReFS, which also solves NTFS's problems, particularly with data integrity, data corruption, and managing large data volumes. To manage data integrity, ReFS employs a B+ tree structure with root, internal, and leaf nodes; to prevent data corruption during a power outage, it employs an allocate-on-write technique.

 

 

The following are various features available in ReFS:

  • Integration with storage spaces: By utilizing a backup copy of the data through its integration with Storage Spaces, ReFS can automatically identify and repair faulty data.
  • Data salvaging: If a replacement copy of faulty, non-correctable data is not available, ReFS removes that data from the namespace while keeping the data volume online.
  • Proactive error correction: ReFS has a scrubber, an integrated scanner that periodically examines the volume to find and fix latent corruption.
  • Mirror-accelerated parity: Using the Storage Spaces Direct feature, which creates volumes that use both mirror and parity resiliency, ReFS provides affordable, space-efficient storage without compromising speed.
  • Block cloning: ReFS has a feature called block cloning that accelerates copy operations and checkpoint merges. Instead of reading and writing file data, it performs copies as a low-cost metadata operation.
  • Sparse VDL: This feature lets ReFS zero files quickly, cutting the time required to build fixed VHDs from tens of minutes to a few seconds.
  • File-level snapshots: ReFS creates a new file with the original data to enable the file-level snapshot capability. It resolves numerous data backup issues, including data protection, efficiency, performance, and recovery of corrupted data.
  • Integrity streams: ReFS offers an optional capability that uses checksums to verify and preserve data integrity. When the user activates this feature, it makes evident whether data is valid or corrupt.
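The integrity-streams idea (store a checksum alongside the data and verify it on every read) can be modeled in a few lines. ReFS uses its own internal checksum scheme; the CRC-32 used below is purely illustrative:

```python
import zlib

def write_with_checksum(data: bytes):
    """Store a block together with a checksum of its contents."""
    return {"data": data, "checksum": zlib.crc32(data)}

def read_and_verify(block):
    """Recompute the checksum on read; a mismatch signals corruption."""
    if zlib.crc32(block["data"]) != block["checksum"]:
        raise IOError("checksum mismatch: data is corrupt")
    return block["data"]

block = write_with_checksum(b"forensic evidence")
assert read_and_verify(block) == b"forensic evidence"

block["data"] = b"forensic evidencE"  # simulate silent corruption
try:
    read_and_verify(block)
except IOError as exc:
    print(exc)
```

Combined with a redundant copy (as in the Storage Spaces integration above), a failed verification can trigger automatic repair rather than just an error.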

ReFS B+ Tree Structure

The file system components of ReFS are arranged in either single-level (leaves) or multilevel (B+ trees) B+ tree structures. Any component of the file system can scale thanks to the B+ tree’s organizational structure, which makes data recovery and storage more efficient. This architecture expedites access times and aids in the efficient management of massive volumes of data.

 

 

Components of the B+ tree structure in the ReFS architecture are as follows:

Root: The node at the top of the B+ tree architecture, known as the root, controls access to the various parts of the file system architecture.

Internal nodes: These nodes act as intermediaries, connecting to child nodes or data blocks and providing index information that guides traversal of the tree.

Leaf node: The node that stores metadata pertaining to files and directories in ReFS is called the leaf node, and it is situated at the base of the B+ tree file system architecture. Every leaf node has a set of keys that correspond to the file or directory names as well as the pertinent metadata.
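A drastically simplified lookup through these three node types might look like the following Python sketch, in which an internal node routes a key to the leaf holding its metadata (the structure, keys, and values are invented for illustration and are far simpler than the real ReFS layout):

```python
# Minimal two-level B+ tree: one internal node routes keys to two leaves.
internal = {"keys": ["m"], "children": [0, 1]}   # keys < "m" go to child 0
leaves = [
    {"apps": "metadata-A", "logs": "metadata-B"},    # leaf 0
    {"music": "metadata-C", "users": "metadata-D"},  # leaf 1
]

def lookup(name):
    """Route a file/directory name through the internal node to its leaf."""
    idx = 0
    for i, separator in enumerate(internal["keys"]):
        if name >= separator:      # key at or past the separator: go right
            idx = i + 1
    return leaves[internal["children"][idx]].get(name)

print(lookup("logs"))   # metadata-B
print(lookup("users"))  # metadata-D
```

Because every node keeps its keys sorted, lookups touch only one node per level, which is why the structure scales to very large volumes.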

Linux File Systems

Linux OS stores data in many file systems. Investigators should be well-versed in the storage techniques used by Linux since they may come across attack sources or victim systems running Linux.

 

Linux File System Architecture

The two components of the Linux file system architecture are as follows:

  1. User space: This is the protected memory region that holds the available memory and in which user processes run.
  2. Kernel space: This is the memory region in which the system provides all kernel services via kernel processes. Users can access this area only through a system call; a user process runs as a kernel process only while it is executing a system call.

The GNU C Library (glibc), located between user and kernel space, provides the system call interface that connects the kernel to user-space applications.

 

A virtual file system (VFS) is an abstraction layer on top of a concrete file system. It makes a variety of file systems accessible to client programs. Its underlying design comprises a dispatching layer, which provides file system abstraction, and several caches that improve the performance of file system operations.

 

Device drivers are pieces of code, associated with every real or virtual device, that help the operating system control the hardware. Their tasks include setting up hardware, bringing associated devices into and out of service, receiving data from hardware and passing it to the kernel, sending data from the kernel to the device, and detecting and handling device errors.

Filesystem Hierarchy Standard (FHS)

Linux’s file system is represented as a single entity with a single hierarchical tree structure.

  • Linux and Unix-like operating systems’ directory structure and contents are defined by the Filesystem Hierarchy Standard (FHS).
  • Every file and directory in the FHS is located under the root directory, which is denoted by /.

Extended File System (EXT)

  • EXT was the first file system created specifically for the Linux operating system, designed to overcome certain limitations of the Minix file system.
  • It has a maximum partition size of 2 GB and a maximum file name length of 255 characters.
  • It eliminates the two main Minix file system limitations: a 64 MB maximum partition size and short file names.
  • Its main drawback is that it does not support separate timestamps for file access, inode modification, and data modification.
  • It was superseded by the second extended file system.

Second Extended File System (EXT2)

  • EXT2 is a common file system that retains more time stamps and employs better algorithms, which dramatically increases its speed.
  • Its main drawbacks are that it is not a journaling file system and that there is a chance of file system damage when writing to EXT2.
  • It also maintains a special field in the superblock that tracks the file system status and marks it as clean or dirty.

Third Extended File System (Ext3)

  • Ext3 is a journaling version of the EXT2 file system and is widely used with the Linux operating system. It is an enhanced version of the EXT2 file system.
  • File system maintenance tools (such as fsck) can be used to maintain and repair the EXT2 file system.
  • To convert an EXT2 file system to EXT3, enter the following command: # /sbin/tune2fs -j <partition-name>

Ext3 Features

Data Integrity: It offers enhanced data integrity for incidents brought on by computer system failures.

 

Speed: The EXT3 file system typically has higher throughput than the EXT2 file system because it is a journaling file system.

 

Easy Transition: The user can easily change the file system from EXT2 to EXT3 and increase the performance of the system.

 

 

Fourth Extended File System (EXT4)

  • EXT4 is a journaling file system, developed as the replacement of the commonly used EXT3 file system.
  • With the addition of new capabilities, EXT4 provides substantial improvements over the EXT3 and EXT2 file systems, particularly in terms of performance, scalability, and reliability.
  • Compatible with Linux Kernel v2.6.19 and later.

Key Features:

  • File system size: supports a maximum of 16 TB for an individual file and 1 EB (exabyte) for the entire EXT4 file system.
  • Extents: replace the block-mapping scheme used by EXT2 and EXT3, improving large-file performance and reducing fragmentation.
  • Delayed allocation: improves performance and reduces fragmentation by effectively allocating larger amounts of data at a time.
  • Multi-block allocation: allocates many blocks in a single operation, placing files contiguously on the disk.
  • fsck speed: enables much quicker file system checking.
  • Journal checksumming: increases reliability by applying checksums to the journal.
  • Persistent preallocation: preallocates a file's on-disk space.
  • Improved timestamps: provides timestamps measured in nanoseconds.
  • Backward compatibility: allows EXT3 and EXT2 file systems to be mounted as EXT4.
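The backward compatibility across the EXT family extends to identification: EXT2, EXT3, and EXT4 share the same superblock magic number, which an investigator can check when classifying an unknown image. A minimal sketch over a synthetic image (the offsets follow the ext on-disk layout; this is not a full superblock parser):

```python
import struct

EXT_MAGIC = 0xEF53  # s_magic value shared by EXT2/3/4 superblocks

def has_ext_superblock(image: bytes) -> bool:
    """Check the magic number of an ext* file system image.

    The primary superblock starts at byte offset 1024; the 16-bit
    little-endian magic field sits at offset 56 within it.
    """
    (magic,) = struct.unpack_from("<H", image, 1024 + 56)
    return magic == EXT_MAGIC

# Synthetic image containing only the magic field, for illustration.
image = bytearray(4096)
struct.pack_into("<H", image, 1024 + 56, EXT_MAGIC)
print(has_ext_superblock(bytes(image)))  # True
```

Distinguishing EXT2 from EXT3 or EXT4 then requires inspecting the superblock's feature flags, which this sketch omits.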

Mac OS X File Systems

UNIX is the foundation of Apple's macOS, which stores data differently from Windows and Linux. Therefore, the forensic methods often used for Windows and Linux cannot be applied directly to macOS. To conduct forensic analyses of macOS file systems, forensic investigators need a thorough understanding of UNIX-based systems.

 

Hierarchical File System (HFS)

  • Apple created the Hierarchical File System, commonly known as Mac OS Standard, in 1985 for the Mac operating system.
  • It organizes files into folders and groups directories with one another.
  • Drives, folders, and files are displayed in groups.
  • It divides a logical volume into 512-byte logical blocks.

Hierarchical File System Plus (HFS+)

  • HFS+, Macintosh's primary file system, is the replacement for HFS.
  • It uses Unicode to name items (files and directories) and supports large files.
  • The Apple iPod uses this format, which is also known as Mac OS Extended (HFS Extended).
  • Users of HFS Plus can:
      – Make effective use of hard drive space
      – Use internationally compliant file names
      – Boot non-Mac OS operating systems

 

Apple File System (APFS)

For iOS 10.3 and later, APFS takes the place of HFS+ as the default file system. Among the many advantages of this improved file system are atomic safe-save primitives, snapshots, sharing of free space between volumes, support for sparse files, cloning (without consuming extra disk space), and quick directory scaling. All Apple operating systems, including watchOS, tvOS, macOS, and iOS, use it. This next-generation file system is made to benefit from native encryption capabilities and flash/SSD storage systems.

APFS is made up of two layers:

  • The container layer: It stores higher-level information, such as volume metadata, encryption state, and volume snapshots, and it arranges data on the filesystem layer.
  • The filesystem layer: It is made up of data structures that hold information like directory hierarchies, file metadata, and file content.

TRIM operations, sparse files, expanded file attributes, speedy directory scaling, snapshots, cloning, high timestamp granularity, faster multi-key encryption, and the copy-on-write metadata functionality are all supported by the APFS file system.
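The cloning and copy-on-write behavior can be sketched as follows: a clone starts out sharing the original file's blocks (a metadata-only copy), and only blocks that are later modified receive new storage. This is an illustrative model, not the actual APFS implementation:

```python
# Conceptual copy-on-write clone: the clone shares block references with the
# original; a write replaces only the affected reference in the writer's file.
class CowFile:
    def __init__(self, blocks):
        self.blocks = list(blocks)   # references to shared block objects

    def clone(self):
        return CowFile(self.blocks)  # no data copied, only references

    def write_block(self, i, data):
        self.blocks[i] = data        # new block for this file only

original = CowFile([b"block0", b"block1"])
copy = original.clone()
copy.write_block(1, b"edited")

print(original.blocks[1])                    # b'block1' (original untouched)
print(copy.blocks[1])                        # b'edited'
print(original.blocks[0] is copy.blocks[0])  # True: still shared
```

This is why APFS clones consume no extra disk space until one of the copies diverges.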

The shortcomings of the previous file system, HFS+, such as its restricted capacity, low security, lack of functionality, and incompatibility with SSDs, are all addressed by APFS.

 

Drawbacks

It is challenging for users to move files from an APFS disk to older macOS devices because APFS-formatted drives are incompatible with OS X 10.11 (El Capitan) and earlier versions. HDDs are poorly suited to the APFS file system because of its copy-on-write behavior and the resulting fragmentation of copied files. Although APFS uses checksums, they guarantee only the integrity of metadata, not user data. The absence of support for non-volatile RAM (NVRAM), compression, and Apple Fusion Drives are some of APFS's other shortcomings.

 

Understand File System Analysis

By analyzing the file system during an investigation, investigators can learn about deleted files, hidden files, and other suspicious data kept on storage media. This can help them gather pertinent evidence, carry out the remainder of the investigation efficiently, and, if necessary, present the evidence in court.

 

CD-ROM/DVD File System

  • The file system for CD-ROM media is defined by ISO (International Organization for Standardization) 9660.
  • It supports data sharing among several computer operating systems, including UNIX-based systems, Mac OS, and Microsoft Windows.
  • Common extensions to ISO 9660 address the following restrictions: 1. Rock Ridge supports UNIX permissions and longer, ASCII-coded names. 2. Joliet supports Unicode names, such as those in non-Roman scripts. 3. El Torito enables bootable CDs.
  • ISO 13490 combines ISO 9660 with multisession support.
  • On CD-ROM and Digital Versatile Disc (DVD) media, Windows supports two different file system types: 1. Compact Disc File System (CDFS)  2. Universal Disk Format (UDF)
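An ISO 9660 volume announces itself through volume descriptors beginning at sector 16, each carrying the identifier CD001. A minimal Python check over a synthetic image (illustrative only; a real parser would walk all descriptor types):

```python
SECTOR = 2048  # ISO 9660 logical sector size

def find_iso9660(image: bytes) -> bool:
    """Check for the ISO 9660 primary volume descriptor (PVD).

    Volume descriptors begin at sector 16; each carries the identifier
    "CD001" at byte offset 1, preceded by a 1-byte type (1 = PVD).
    """
    pvd = image[16 * SECTOR : 17 * SECTOR]
    return pvd[0] == 1 and pvd[1:6] == b"CD001"

# Synthetic image with a minimal descriptor, for illustration only.
image = bytearray(20 * SECTOR)
image[16 * SECTOR] = 1  # descriptor type 1 = primary volume descriptor
image[16 * SECTOR + 1 : 16 * SECTOR + 6] = b"CD001"
print(find_iso9660(bytes(image)))  # True
```

Joliet volumes add a supplementary descriptor (type 2) after the PVD, which is how tools detect Unicode naming support.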

Compact Disc File System (CDFS)

  • CDFS (CD File System) is a file system for the Linux operating system.
  • All tracks and boot images on a CD are shown as regular files.
  • It makes data in old ISO images accessible.

Virtual File System (VFS) and Universal Disk Format File System (UDF)

Virtual File System (VFS)

  • A VFS is software that defines an interface between the OS kernel and the various concrete file systems.
  • VFS provides client applications with uniform access to the different concrete file systems. For instance, it gives a client application transparent access to both local and networked storage devices without any discernible difference.
  • Examples of VFSs include Oracle Clustered File System (OCFS), New Technology File System (NTFS), Global File System (GFS), VMware Virtual Machine File System (VMFS).

Universal Disk Format File System (UDF)

  • UDF was developed by the Optical Storage Technology Association (OSTA) with the goal of replacing the ISO 9660 file system on optical media and FAT on removable media.
  • Based on the ISO/IEC 13346 and ECMA-167 standards, UDF is an open file system format that specifies how data is stored and interchanged on a wide range of optical media.

Understand Storage Systems

Storage systems such as redundant array of independent disks (RAID) and Just a Bunch of Drives/Disks (JBOD) connect multiple HDDs to increase the storage capacity of a system.

 

RAID Storage System

  • The technology known as Redundant Array of Independent Disks (RAID) uses several smaller disks working together as a single large volume.
  • By offering a specific way to access one or more distinct hard drives, it improves access time and lowers the possibility of losing all data in the event of hard disk failure or damage.
  • This technique was created to:
      – Maintain a significant amount of data storage
      – Improve input/output performance
      – Increase dependability by utilizing data redundancy
 

Levels of RAID Storage System

RAID 0

  • Data is divided into blocks and written evenly across several hard disks.
  • It enhances I/O performance by distributing the I/O load across numerous channels and disk drives.
  • Data recovery is impossible if a disk fails.
  • It does not provide data redundancy.
  • At least two drives are needed for setup.
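The striping scheme above can be sketched as a round-robin distribution of blocks across the member drives (an illustrative model only; real controllers work with fixed-size stripes of raw bytes):

```python
# RAID 0 sketch: consecutive blocks go round-robin to the member drives, so
# reads and writes are spread over all of them in parallel.
def stripe(blocks, num_drives):
    drives = [[] for _ in range(num_drives)]
    for i, block in enumerate(blocks):
        drives[i % num_drives].append(block)
    return drives

blocks = [f"B{i}" for i in range(6)]
print(stripe(blocks, 2))  # [['B0', 'B2', 'B4'], ['B1', 'B3', 'B5']]
```

Losing either drive loses half the blocks of every file, which is why RAID 0 offers no recovery path.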

 

RAID 1

  • It is intended to recover data in the case of disk failure and consists of two disks for every volume.
  • The two disks contain the same information.
  • It helps to avoid computer outages and guarantees that data is not lost.

RAID 2

  • By setting up two or more drives as a single large volume, as in RAID 0, it offers quick access and additional capacity.
  • Data is striped at the bit level rather than stored in blocks.
  • Error-correcting code (ECC) is used to confirm that writes were successful.
  • It is slower than RAID 0 but provides superior data verification.

RAID 3

  • It requires a minimum of three disks and employs data striping with dedicated parity.
  • Data is striped at the byte level across several drives, and one drive is dedicated to holding parity information.
  • The parity drive can be used for data recovery and error correction if a drive fails.

RAID 5

  • Data is striped at the byte level across several drives, and parity information is distributed among all member drives.
  • Writing data is slow.
  • At least three drives are needed for setup.
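The parity idea behind RAID levels 3 and 5 can be sketched with XOR: the parity block is the XOR of the data blocks, so any single lost block can be rebuilt from the survivors. An illustrative model with two tiny data blocks:

```python
# Parity sketch: parity = XOR of all data blocks; XOR-ing the survivors
# with the parity reconstructs the missing block after a drive failure.
def xor_blocks(*blocks):
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)

d0, d1 = b"\x0f\xf0", b"\xaa\x55"
parity = xor_blocks(d0, d1)

# The drive holding d1 fails: rebuild it from the surviving block and parity.
recovered = xor_blocks(d0, parity)
print(recovered == d1)  # True
```

RAID 3 keeps all parity on one dedicated drive, while RAID 5 rotates the parity blocks across the member drives to avoid a single parity bottleneck.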

RAID 10 or Mirrored Striping

  • It takes a minimum of four disks to implement and combines RAID 0 (Striping Volume data) and RAID 1 (Disk Mirroring).
  • It has the same overhead as mirroring alone and the same failure tolerance as RAID level 1.
  • It enables disks to be mirrored in pairs for redundancy and enhanced performance. For optimal performance, data is then striped across several disks.

 

Host Protected Areas (HPA) and Device Configuration Overlays (DCO)

  • The hidden parts of a hard drive are called the host protected area (HPA) and the device configuration overlay (DCO).
  • HPA:
      – The HDD's HPA is a designated space for data storage that cannot be altered, changed, or accessed by the user, BIOS, or OS.
      – This section contains boot sector code, diagnostic tools, HDD utilities, etc.
  • DCO:
      – Modern hard disks contain an extra hidden region called the DCO, which allows system vendors to purchase HDDs of various capacities from different manufacturers and configure them to have an identical number of sectors.
      – It can also be used to activate or deactivate HDD functionalities.
  • Hackers use specific tools to alter and write to the HDD's HPA and DCO sections with the intention of concealing information.
  • HPA and DCO areas are problematic during an investigation because many tools cannot detect their presence.
  • To identify and image HPA and DCO areas, investigators use tools such as EnCase, the TAFT forensics tool, The Sleuth Kit, etc.

Network-Attached Storage (NAS)

A NAS is a centralized storage system made up of one or more servers with several dedicated hard drives in a RAID arrangement for redundancy. It uses a shared network to store and distribute data across multiple clients. It has its own distinct IP address, participates in the local area network (LAN) as an independent network node, and can connect to shared storage devices over a regular Ethernet connection.

NAS facilitates efficient data exchange across numerous clients situated remotely or in different time zones when working in teams. Clients can easily access files or folders from any network-connected device by connecting NAS to a wireless router or switch.

 

 

The following are the features of NAS:

  • It enables file sharing across networks via popular file-sharing protocols like SMB/CIFS (Windows), NFS (Unix/Linux), and AFP (Apple).
  • It streamlines data management and offers a centralized location to store files, documents, media, and backups.
  • The administrator can use it to schedule backups of critical data to the NAS.
  • It can be accessed remotely.
  • Files on the NAS can be accessed and edited by multiple people at once.
  • NAS systems can be readily expanded by adding more hard drives.
  • NAS equipment is equipped with security measures like firewalls, authentication, and data encryption to guard against unwanted access.

The following three types of NAS devices are distinguished by the number of drives, drive support, drive capacity, and scalability.

  1. High-end or enterprise NAS: Enterprise NAS is driven by businesses that store large amounts of file data, including virtual machine images. It also offers NAS clustering.
  2. Mid-market NAS: This kind of NAS suits businesses that need several hundred terabytes of storage. In the event of a system breakdown, it offers a point-in-time snapshot feature to guarantee data protection. The devices cannot be clustered.
  3. Desktop or low-end NAS: These systems are used by small organizations that need local shared storage. They also provide easier backup and quicker data access.

Storage Area Network (SAN)

A SAN is a specialized high-speed network that provides access to block-level data storage. Because a SAN is a network unto itself, it is unaffected by other network traffic, such as LAN bottlenecks. Its architecture lets multiple servers access the network of storage devices as an attached drive, so servers can access data shared across several disk arrays as if it were on a local hard drive. It may span several different sites or locations.

A SAN comprises storage devices, multiple switches, and networked hosts, which can be connected with Fibre Channel (FC) technology, enabling high data rates and continuous data access. Because FC SANs are costly and complex, small and mid-sized businesses employ Ethernet-based Internet Small Computer Systems Interface (iSCSI) as a less expensive substitute.

By encapsulating SCSI commands into IP packets that do not require an FC connection, Ethernet-based iSCSI lessens the difficulties associated with FC technology.

 

 

Reference

1. EC-Council CHFI eBook