Welcome to YAFFS, the first file system developed specifically for NAND flash.

It is now YAFFS2 - original YAFFS (YAFFS1) only supports 512-byte page
NAND and is now deprecated. YAFFS2 supports 512b page in 'YAFFS1
compatibility' mode (CONFIG_YAFFS_YAFFS1) and 2K or larger page NAND
in YAFFS2 mode (CONFIG_YAFFS_YAFFS2).


A note on licencing
-------------------
YAFFS is available under the GPL and via alternative licensing
arrangements with Aleph One. If you're using YAFFS as a Linux kernel
file system then it will be under the GPL. For use in other situations
you should discuss licensing issues with Aleph One.


Terminology
-----------
Page - NAND addressable unit (normally 512b or 2Kbyte size) - can
       be read, written, marked bad. Has associated OOB.
Block - Erasable unit. 64 Pages. (128K on 2K NAND, 32K on 512b NAND)
OOB - 'spare area' of each page for ECC, bad block marker and YAFFS
      tags. 16 bytes per 512b - 64 bytes for 2K page size.
Chunk - Basic YAFFS addressable unit. Same size as Page.
Object - YAFFS Object: File, Directory, Link, Device etc.
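
The sizes quoted in the list above follow from two figures: 64 pages
per block and 16 OOB bytes per 512 bytes of page data. A minimal
sketch of that arithmetic is shown here. The function and constant
names are hypothetical and are not taken from the YAFFS sources.

```c
#include <assert.h>
#include <stdint.h>

/* Figures from the terminology list; the names are illustrative only. */
enum { PAGES_PER_BLOCK = 64, OOB_BYTES_PER_512 = 16 };

/* Size in bytes of one erasable block for a given page size. */
uint32_t block_bytes(uint32_t page_bytes)
{
    return (uint32_t)PAGES_PER_BLOCK * page_bytes;
}

/* Size in bytes of the OOB ('spare') area attached to one page. */
uint32_t oob_bytes(uint32_t page_bytes)
{
    assert(page_bytes % 512u == 0); /* sketch assumes 512b multiples */
    return (page_bytes / 512u) * (uint32_t)OOB_BYTES_PER_512;
}
```

For a 2K page this gives a 128K block and a 64 byte OOB area, matching
the figures in the list.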

YAFFS design
------------

YAFFS is a log-structured filesystem. It is designed particularly for
NAND (as opposed to NOR) flash, to be flash-friendly, robust due to
journalling, and to have low RAM and boot time overheads. File data is
stored in 'chunks'. Chunks are the same size as NAND pages. Each page
is marked with file id and chunk number. These marking 'tags' are
stored in the OOB (or 'spare') region of the flash. The chunk number
is determined by dividing the file position by the chunk size. Each
chunk has a number of valid bytes, which equals the page size for all
except the last chunk in a file.
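
The mapping from a file position to a chunk can be sketched as
follows. This is an illustration only: the type and function names
are hypothetical, and the index computed here is the plain quotient
described above. How that quotient relates to the chunk ID stored in
the tags (where chunk ID 0 is the file header) is not modelled.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical type: where a byte offset lands in the chunk sequence. */
struct chunk_pos {
    uint32_t index;   /* file position divided by chunk size */
    uint32_t offset;  /* byte offset inside that chunk */
};

struct chunk_pos locate_chunk(uint32_t file_pos, uint32_t chunk_bytes)
{
    struct chunk_pos p;

    assert(chunk_bytes != 0);
    p.index = file_pos / chunk_bytes;
    p.offset = file_pos % chunk_bytes;
    return p;
}

/* Valid bytes held by chunk `index` of a file that is `file_bytes` long. */
uint32_t valid_bytes(uint32_t file_bytes, uint32_t index, uint32_t chunk_bytes)
{
    uint64_t start = (uint64_t)index * chunk_bytes;
    uint64_t remaining;

    if (start >= file_bytes)
        return 0; /* chunk lies beyond the end of the file */
    remaining = file_bytes - start;
    return remaining < chunk_bytes ? (uint32_t)remaining : chunk_bytes;
}
```

With 2K chunks, byte 5000 of a file falls in chunk index 2 at offset
904, and if the file is 5000 bytes long that last chunk holds 904
valid bytes while the earlier chunks are full.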

File 'headers' are stored as the first page in a file, marked as a
different type to data pages. The same mechanism is used to store
directories, device files, links etc. The first page describes which
type of object it is.

YAFFS2 never re-writes a page, because the spec of NAND chips does not
allow it. (YAFFS1 used to mark a block 'deleted' in the OOB). Deletion
is managed by moving deleted objects to the special, hidden 'unlinked'
directory. These records are preserved until all the pages containing
the object have been erased (we know when this happens by keeping a
count of chunks remaining on the system for each object - when it
reaches zero the object really is gone).

When data in a file is overwritten, the relevant chunks are replaced
by writing new pages to flash containing the new data but the same
tags.

Pages are also marked with a short (2 bit) serial number that
increments each time the page at this position is replaced. The
reason for this is that if power loss/crash/other act of demonic
forces happens before the replaced page is marked as discarded, it is
possible to have two pages with the same tags. The serial number is
used to arbitrate.
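
One way such arbitration could work is sketched below. The text above
only says that the serial number is used to arbitrate, so the rule
here (the newer page is the one whose serial is the direct successor
of the other, modulo 4) is an assumption, and the function names are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Advance a 2-bit serial number: 0, 1, 2, 3, then back to 0. */
uint8_t next_serial(uint8_t s)
{
    assert(s < 4);
    return (uint8_t)((s + 1u) & 3u);
}

/*
 * Assumed rule: page A supersedes page B when A's serial is exactly
 * one step ahead of B's, taking the 2-bit wrap-around into account.
 */
int serial_is_newer(uint8_t a, uint8_t b)
{
    assert(a < 4 && b < 4);
    return (((unsigned)a - (unsigned)b) & 3u) == 1u;
}
```

Because the comparison is done modulo 4, a page with serial 0 is
treated as newer than one with serial 3, which is what is wanted when
the counter wraps.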

A block containing only discarded pages (termed a dirty block) is an
obvious candidate for garbage collection. Otherwise valid pages can be
copied off a block thus rendering the whole block discarded and ready
for garbage collection.

In theory you don't need to hold the file structure in RAM... you
could just scan the whole flash looking for pages when you need them.
In practice though you'd want better file access times than that! The
mechanism proposed here is to have a list of __u16 page addresses
associated with each file. Since there are 2^18 pages in a 128MB NAND,
a __u16 is insufficient to uniquely identify a page but it does
identify a group of 4 pages - a small enough region to search
exhaustively. This mechanism is clearly expandable to larger NAND
devices - within reason. The RAM overhead with this approach is approx
2 bytes per page - 512kB of RAM for a whole 128MB NAND.
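
The arithmetic behind the group-of-4 figure can be checked with a
short sketch. The names are hypothetical, and the way the group size
is derived here (drop low address bits until the remaining address
fits in 16 bits) is one plausible reading of the scheme rather than
the actual implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Low address bits to drop so that a page address fits in 16 bits. */
unsigned group_shift(uint32_t total_pages)
{
    unsigned shift = 0;

    assert(total_pages > 0);
    while ((total_pages >> shift) > 0x10000u)
        shift++;
    return shift;
}

/* 16-bit group number that a given page belongs to. */
uint16_t group_of(uint32_t page, unsigned shift)
{
    return (uint16_t)(page >> shift);
}

/* First page of a group; the search covers (1 << shift) pages from here. */
uint32_t group_first_page(uint16_t group, unsigned shift)
{
    return (uint32_t)group << shift;
}

/* RAM cost of the lists at roughly 2 bytes per page. */
uint32_t ram_overhead_bytes(uint32_t total_pages)
{
    return total_pages * 2u;
}
```

For a 128MB device with 512-byte pages there are 2^18 pages, the
shift comes out as 2 (groups of 4 pages), and the overhead is 512kB,
in line with the figures above.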

Boot-time scanning to build the file structure lists only requires
one pass reading NAND. If proper shutdowns happen the current RAM
summary of the filesystem status is saved to flash, called
'checkpointing'. This saves re-scanning the flash on startup, and gives
huge boot/mount time savings.

YAFFS regenerates its state by 'replaying the tape' - i.e. by
scanning the chunks in their allocation order (i.e. block sequence ID
order), which is usually different from the media block order. Each
block is still only read once - starting from the end of the media and
working back.
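
Ordering blocks for such a replay might look like the sketch below.
The struct and function names are hypothetical. Sorting newest first
is an assumption: it is one reading of 'working back', and the text
above does not spell out the direction.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical per-block record gathered while reading the media. */
struct block_info {
    uint32_t block;     /* physical block number on the media */
    uint32_t sequence;  /* sequence number read from the block's tags */
};

/* qsort comparator: higher sequence numbers (newer blocks) first. */
static int by_sequence_desc(const void *a, const void *b)
{
    const struct block_info *x = a;
    const struct block_info *y = b;

    if (x->sequence != y->sequence)
        return x->sequence < y->sequence ? 1 : -1;
    return 0;
}

/* Reorder the records into replay order; each block appears once. */
void order_for_replay(struct block_info *blocks, size_t count)
{
    assert(blocks != NULL || count == 0);
    if (count > 1)
        qsort(blocks, count, sizeof blocks[0], by_sequence_desc);
}
```

After sorting, the scan visits the blocks in sequence order even
though their physical block numbers are scattered across the media.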
William Juul0e8cc8b2007-11-15 11:13:05 +010091
92YAFFS tags in YAFFS1 mode:
93
9418-bit Object ID (2^18 files, i.e. > 260,000 files). File id 0- is not
95 valid and indicates a deleted page. File od 0x3ffff is also not valid.
96 Synonymous with inode.
972-bit serial number
9820-bit Chunk ID within file. Limit of 2^20 chunks/pages per file (i.e.
99 > 500MB max file size). Chunk ID 0 is the file header for the file.
10010-bit counter of the number of bytes used in the page.
10112 bit ECC on tags

YAFFS tags in YAFFS2 mode:
  4 bytes 32-bit chunk ID
  4 bytes 32-bit object ID
  2 bytes Number of data bytes in this chunk
  4 bytes Sequence number for this block
  3 bytes ECC on tags
 12 bytes ECC on data (3 bytes per 256 bytes of data)


Page allocation and garbage collection

Pages are allocated sequentially from the currently selected block.
When all the pages in the block are filled, another clean block is
selected for allocation. At least two or three clean blocks are
reserved for garbage collection purposes. If there are insufficient
clean blocks available, then a dirty block (i.e. one containing only
discarded pages) is erased to free it up as a clean block. If no dirty
blocks are available, then the dirtiest block is selected for garbage
collection.

Garbage collection is performed by copying the valid data pages into
new data pages thus rendering all the pages in this block dirty and
freeing it up for erasure. I also like the idea of selecting a block
at random some small percentage of the time - thus reducing the chance
of wear differences.
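
Choosing the block to reclaim can be sketched as a search for the
fully written block with the fewest valid pages. A wholly dirty block
(no valid pages) wins automatically, otherwise the dirtiest block is
returned. The bookkeeping struct, the names, and the rule that only
completely written blocks are considered are assumptions made for the
sketch. The occasional random pick mentioned above is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define GC_PAGES_PER_BLOCK 64u

/* Hypothetical per-block bookkeeping kept in RAM. */
struct block_state {
    uint8_t valid_pages;  /* pages still holding live data */
    uint8_t used_pages;   /* pages written since the last erase */
};

/* Index of the block to reclaim next, or -1 if no block qualifies. */
long pick_gc_block(const struct block_state *blocks, size_t count)
{
    long best = -1;
    size_t i;

    assert(blocks != NULL || count == 0);
    for (i = 0; i < count; i++) {
        /* Skip blocks that are clean or still being allocated from. */
        if (blocks[i].used_pages < GC_PAGES_PER_BLOCK)
            continue;
        if (best < 0 || blocks[i].valid_pages < blocks[best].valid_pages)
            best = (long)i;
    }
    return best;
}
```

The caller would then copy the remaining valid pages of the chosen
block elsewhere and erase it.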

YAFFS is single-threaded. Garbage-collection is done as a parasitic
task of writing data. So each time some data is written, a bit of
pending garbage collection is done. More pages are garbage-collected
when free space is tight.


Flash writing

YAFFS only ever writes each page once, complying with the requirements
of the most restrictive NAND devices.

Wear levelling

This comes as a side-effect of the block-allocation strategy. Data is
always written on the next free block, so they are all used equally.
Blocks containing data that is written but never erased will not get
back into the free list, so wear is levelled over only blocks which
are free or become free, not blocks which never change.



Some helpful info
-----------------

Formatting a YAFFS device is simply done by erasing it.

Making an initial filesystem can be tricky because YAFFS uses the OOB
and thus the bytes that get written depend on the YAFFS data (tags),
and the ECC bytes and bad block markers which are dictated by the
hardware and/or the MTD subsystem. The data layout also depends on the
device page size (512b or 2K). Because YAFFS is only responsible for
some of the OOB data, generating a filesystem offline requires
detailed knowledge of what the other parts (MTD and NAND
driver/hardware) are going to do.

To make a YAFFS filesystem you have 3 options:

1) Boot the system with an empty NAND device mounted as YAFFS and copy
   stuff on.

2) Make a filesystem image offline, then boot the system and use
   MTDutils to write an image to flash.

3) Make a filesystem image offline and use some tool like a bootloader to
   write it to flash.

Option 1 avoids a lot of issues because all the parts
(YAFFS/MTD/hardware) take care of their own bits and (if you have
put things together properly) it will 'just work'. YAFFS just needs to
know how many bytes of the OOB it can use. However sometimes it is not
practical.

Option 2 lets MTD/hardware take care of the ECC so the filesystem
image just has to know which bytes to use for YAFFS Tags.

Option 3 is hardest as the image creator needs to know exactly what
ECC bytes, endianness and algorithm to use as well as which bytes are
available to YAFFS.

mkyaffs2image creates an image suitable for option 3 for the
particular case of yaffs2 on 2K page NAND with default MTD layout.

mkyaffsimage creates an equivalent image for 512b page NAND (i.e.
yaffs1 format).

Bootloaders
-----------

A bootloader using YAFFS needs to know how MTD is laying out the OOB
so that it can skip bad blocks.

YAFFS Tracing
-------------