Overview

This homework introduces fsck.py, a simple file system simulator. Some
familiarity with the vsfs.py simulator (described in an earlier chapter on
file system implementation) would be useful, but is not necessary.

The file system the tool changes the on-disk state of the file system, and
then leaves it to you, the user, to detect the problem, much like a file
system consistency checker would do (e.g., fsck). The file system itself is
based on a simplified VSFS, the very simple file system described in an
earlier chapter. The tool first generates an on-disk file system; then, either
randomly or through more specific controls, the tool changes the on-disk state
of one aspect of the file system.

The challenge for you, the user, is to figure out which part of the file
system's on-disk state was changed and hence has become inconsistent. In many
cases, the problem should be readily detected; in others, perhaps less so.

Let's get into the details. Here is an example run of the tool:

prompt> ./fsck.py -S 1
...
Final state of file system:

inode bitmap 1100100010000010
inodes       [d a:0 r:4] [f a:-1 r:1] [] [] [d a:8 r:2] [] [] [] 
             [d a:6 r:2] [] [] [] [] [f a:15 r:1] [f a:12 r:1] []
data bitmap  1000001010001001
data         [(.,0) (..,0) (g,8) (t,14) (w,4) (m,13)] [] [] [] [] [] 
             [(.,8) (..,0)] [] [(.,4) (..,0) (p,1)] [] [] [] [z] [] [] [g]

Can you figure out how the file system was corrupted?

prompt> 

In this case, we run the tool with flag -S set to 1 (for reasons
that will be clear later). To understand the change to on-disk state,
and the possible inconsistency, first you have to understand the file
system format. It is (as promised!) very simple.

First, there is an inode bitmap, which marks whether each corresponding inode
is allocated (1) or free (0); in this case, there are 16 inodes, with inodes
0, 1, 5, 9, and 14 marked as allocated.

Second, there are 16 inodes. Unallocated inodes are shown as empty brackets --
[] -- whereas allocated inodes have contents such as [f a:15 r:1] above. Each
allocated inode has three fields. The first is the type of the file, either
'd' for a directory or 'f' for a regular file. The second is a single address
field 'a' which either points to a single data block (in this limited file
system, files/directories can only contain a single block) or has a -1
indicating the file is empty. The third is a reference count; for regular
files, this represents the number of times this file is linked into the file
system; for directories, it represents the number of directories within this
directory.

Third is the data bitmap. Much like the inode bitmap, it indicates whether
each data block in the file system is allocated or free.

Finally, there are the data blocks themselves. For regular files, data block
contents are arbitrary (but usually contain something simple, like a letter;
in real life, these would just contain a block of file data), whereas
directory contents are structured as a list of (name, inode number) pairs.

The root directory in this file system is always inode number 0; it is from
this point that you can start in order to build knowledge of the entire file
system and its contents.

This particular file system contains the following files and directories:

  Directories: '/', '/g', '/w'
  Files:       '/t', '/m', '/w/p'

Can you see why? Spend some time, starting with the root directory, and
see if you can figure it out. If not, well, this homework will be hard for
you.

Now, let's get to the inconsistency part. This particular file system has a
simple specific inconsistency within it. Can you see it?

Once you've spent some time on this, run the tool with the -c flag to see if
you were right:

prompt> ./fsck.py -S 1 -c

Initial state of file system:

inode bitmap 1100100010000110
inodes       [d a:0 r:4] [f a:-1 r:1] [] [] [d a:8 r:2] [] [] [] [d a:6 r:2]
             [] [] [] [] [f a:15 r:1] [f a:12 r:1] []
data bitmap  1000001010001001
data         [(.,0) (..,0) (g,8) (t,14) (w,4) (m,13)] [] [] [] [] [] 
             [(.,8) (..,0)] [] [(.,4) (..,0) (p,1)] [] [] [] [z] [] [] [g]

CORRUPTION::INODE BITMAP corrupt bit 13

Final state of file system:

inode bitmap 1100100010000010
inodes       [d a:0 r:4] [f a:-1 r:1] [] [] [d a:8 r:2] [] [] [] [d a:6 r:2]
             [] [] [] [] [f a:15 r:1] [f a:12 r:1] []
data bitmap  1000001010001001
data         [(.,0) (..,0) (g,8) (t,14) (w,4) (m,13)] [] [] [] [] [] 
             [(.,8) (..,0)] [] [(.,4) (..,0) (p,1)] [] [] [] [z] [] [] [g]

prompt> 

As you can see from the output, the inode bitmap was changed in bit 13,
marking that bit free when actually the inode is allocated. You can see the
inconsistency by looking at inode 13 itself, which looks like this:
[f a:15 r:1] (a regular file, pointing to data block 15). You also know that
this file is in the root directory (/m). Thus, concluding that the inode
bitmap changed here is fairly straightforward.

Many other corruptions are possible (but the tool only introduces one state
change at a time, to keep your life simple). Do the homework to find out
more about them.

The other flags available for the tool are:

prompt> ./fsck.py -h
Options:
  -h, --help            show this help message and exit
  -s SEED, --seed=SEED  first random seed (for a filesystem)
  -S SEEDCORRUPT, --seedCorrupt=SEEDCORRUPT
                        second random seed (for corruptions)
  -i NUMINODES, --numInodes=NUMINODES
                        number of inodes in file system
  -d NUMDATA, --numData=NUMDATA
                        number of data blocks in file system
  -n NUMREQUESTS, --numRequests=NUMREQUESTS
                        number of requests to simulate
  -p, --printFinal      print the final set of files/dirs
  -w WHICHCORRUPT, --whichCorrupt=WHICHCORRUPT
                        do a specific corruption
  -c, --compute         compute answers for me
  -D, --dontCorrupt     don't actually corrupt file system

The reason for two random seeds is worth describing. The first one, -s,
creates different random file systems. The second one, -S, inserts different
file system state changes (corruptions). Thus, you can generate one particular
random file system and then try a bunch of different random corruptions upon
it.

Some other flags just control the size of the file system (-i and -d), whereas
others determine how many file operations are run against the file system
(-n). The -p flag prints the contents of the (non-corrupted) file system; the
-w flag allows you to specify a particular corruption (although this is mostly
for testing); the -D flag turns off corruption to, allowing you to examine an
intact file system.