
Toyota Electronic Parts Catalog: Format of IMAGEF.DAT

Discussion in 'Knowledge Base Articles Discussion' started by Elektroingenieur, Aug 10, 2019.

    Users of Toyota’s Electronic Parts Catalog (EPC) system, and its derivatives, such as parts.toyota.com, may have noticed the fax-like quality of the illustrations. This isn’t a coincidence: the parts drawings are indeed delivered in a fax image format, presumably because it provides acceptable quality and a good compression ratio.

    When Toyota’s Service Parts Engineering Administration Division publishes the EPC on DVD-ROM discs, the illustrations are stored together in a large file called IMAGEF.DAT. I describe its format here, for readers who may have these DVDs (sometimes available from Yahoo! Auctions sellers in Japan) but find it inconvenient to use the included viewing application, which works only with Microsoft Windows.

    IMAGEF.DAT contains a sequence of illustrations, each with a header, image data, references list, and padding.

    The header is 21 bytes long:
    • Bytes 0–2: Size of the illustration in bytes, excluding the padding
    • Byte 3: Unknown; always 0x33 or 0x34
    • Bytes 4–10: Illustration identifier (the letters and digits usually seen in the lower right corners of the images)
    • Bytes 11–12: Image width in pixels
    • Bytes 13–14: Image height in pixels
    • Byte 15: Unknown; always 0x33
    • Bytes 16–18: Size of the image data in bytes
    • Bytes 19–20: Size of the references list in bytes
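A minimal Python sketch of reading this header, assuming the big-endian byte order and space-filled ASCII identifiers described further down in this post. The names `IllustrationHeader` and `parse_header` are made up for illustration and are not part of any Toyota software.

```python
from typing import NamedTuple

HEADER_SIZE = 21  # header length in bytes, per the field list above


class IllustrationHeader(NamedTuple):
    total_size: int   # bytes 0-2: illustration size, excluding padding
    unknown_3: int    # byte 3: observed as 0x33 or 0x34
    identifier: str   # bytes 4-10: illustration identifier
    width: int        # bytes 11-12: image width in pixels
    height: int       # bytes 13-14: image height in pixels
    unknown_15: int   # byte 15: observed as 0x33
    image_size: int   # bytes 16-18: size of the image data
    refs_size: int    # bytes 19-20: size of the references list


def parse_header(buf: bytes) -> IllustrationHeader:
    """Decode the 21-byte illustration header at the start of buf."""
    if len(buf) < HEADER_SIZE:
        raise ValueError("need at least 21 bytes for the header")
    return IllustrationHeader(
        total_size=int.from_bytes(buf[0:3], "big"),
        unknown_3=buf[3],
        # Identifiers are filled to the field length with spaces.
        identifier=buf[4:11].decode("ascii").rstrip(" "),
        width=int.from_bytes(buf[11:13], "big"),
        height=int.from_bytes(buf[13:15], "big"),
        unknown_15=buf[15],
        image_size=int.from_bytes(buf[16:19], "big"),
        refs_size=int.from_bytes(buf[19:21], "big"),
    )
```

With the header decoded, the image data should occupy the `image_size` bytes starting at offset 21, and the references list the `refs_size` bytes after that.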
    The header is followed by the image data, stored in the Group 4 compressed format for black-and-white images defined in ITU-T Recommendation T.6, Facsimile coding schemes and coding control functions for Group 4 facsimile apparatus. Pad bits are used to fill the image data to a byte boundary.

    This image coding scheme is supported natively by the PDF (as “CCITTFaxDecode”) and TIFF (as “T6-encoding”) file formats. To make a TIFF file from the image data, for example, prepend the following 130 bytes, filling in the image width (wwww), image height (hhhh, in two places), and byte size of the image data (ss ssss) where indicated:

    4D4D 002A 0000 0008 0009 0100 0003 0000 0001 wwww 0000 0101
    0003 0000 0001 hhhh 0000 0103 0003 0000 0001 0004 0000 0106
    0003 0000 0001 0000 0000 0111 0004 0000 0001 0000 0082 0116
    0003 0000 0001 hhhh 0000 0117 0004 0000 0001 00ss ssss 011A
    0005 0000 0001 0000 007A 011B 0005 0000 0001 0000 007A 0000
    0000 0000 0048 0000 0001
    ...image data follows...
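The same 130-byte prefix can be generated in a few lines of Python. This is only a sketch: the function names `tiff_header` and `make_tiff` are hypothetical, and the template is simply the hex dump above with the placeholders turned into format fields.

```python
# The 130-byte TIFF prefix from the hex dump above, with the width,
# height, and image data size left as format fields.
TIFF_TEMPLATE = (
    "4D4D 002A 0000 0008 0009 0100 0003 0000 0001 {w:04X} 0000 0101 "
    "0003 0000 0001 {h:04X} 0000 0103 0003 0000 0001 0004 0000 0106 "
    "0003 0000 0001 0000 0000 0111 0004 0000 0001 0000 0082 0116 "
    "0003 0000 0001 {h:04X} 0000 0117 0004 0000 0001 {s:08X} 011A "
    "0005 0000 0001 0000 007A 011B 0005 0000 0001 0000 007A 0000 "
    "0000 0000 0048 0000 0001"
)


def tiff_header(width: int, height: int, image_size: int) -> bytes:
    """Return the TIFF prefix for one Group 4 image strip."""
    if not (0 < width <= 0xFFFF and 0 < height <= 0xFFFF):
        raise ValueError("width and height must fit in 16 bits")
    if not (0 <= image_size <= 0xFFFFFF):
        raise ValueError("image size must fit in 24 bits")
    # bytes.fromhex skips the spaces between the hex groups.
    return bytes.fromhex(TIFF_TEMPLATE.format(w=width, h=height, s=image_size))


def make_tiff(width: int, height: int, image_data: bytes) -> bytes:
    """Prepend the TIFF prefix to raw Group 4 image data."""
    return tiff_header(width, height, len(image_data)) + image_data
```

Writing the result of `make_tiff` to a file with a `.tif` extension should give something that ordinary image viewers can open, provided they support Group 4 compression.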

    The image data is followed by the references list, a series of references to other figures, part name codes, or standard part numbers shown in the illustration. Each reference is 22 bytes long:
    • Bytes 0–1: Upper left X coordinate of image hotspot
    • Bytes 2–3: Upper left Y coordinate of image hotspot
    • Bytes 4–5: Lower right X coordinate of image hotspot
    • Bytes 6–7: Lower right Y coordinate of image hotspot
    • Byte 8: Referent type: a figure (0x31), a part name code (0x33), or a standard part number (0x34)
    • Byte 9: Referent length: 0 (0x00), 4 (0x04), 5 (0x05), 6 (0x06), 10 (0x0A), or 12 (0x0C) characters
    • Bytes 10–21: Referent
    In the header and references list, the byte sizes, width, height, and coordinates are big-endian, unsigned integers. The illustration identifiers and referents are strings of eight-bit characters, using the ISO 646 code points for 0 (0x30) through 9 (0x39) and A (0x41) through Z (0x5A), filled to the field length with spaces (0x20).
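As a sketch of how the references list might be decoded in Python, under the layout and byte order given above: the names `Reference` and `parse_references` are hypothetical, and stripping the trailing spaces (rather than slicing by the stored length) is a choice made here for simplicity.

```python
import struct
from typing import List, NamedTuple

REFERENCE_SIZE = 22  # each reference is 22 bytes long

REFERENT_TYPES = {
    0x31: "figure",
    0x33: "part name code",
    0x34: "standard part number",
}


class Reference(NamedTuple):
    left: int      # upper left X of the hotspot
    top: int       # upper left Y of the hotspot
    right: int     # lower right X of the hotspot
    bottom: int    # lower right Y of the hotspot
    kind: str      # decoded referent type
    length: int    # referent length as stored in byte 9
    referent: str  # referent with the space fill removed


def parse_references(buf: bytes) -> List[Reference]:
    """Decode a references list into a list of Reference tuples."""
    if len(buf) % REFERENCE_SIZE:
        raise ValueError("references list is not a multiple of 22 bytes")
    refs = []
    for off in range(0, len(buf), REFERENCE_SIZE):
        rec = buf[off:off + REFERENCE_SIZE]
        # Four big-endian 16-bit coordinates, then type and length bytes.
        left, top, right, bottom, kind, length = struct.unpack(">4H2B", rec[:10])
        refs.append(Reference(
            left, top, right, bottom,
            REFERENT_TYPES.get(kind, "unknown (0x%02X)" % kind),
            length,
            rec[10:22].decode("ascii").rstrip(" "),
        ))
    return refs
```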

    The references list is followed by the padding, consisting of zero or more zero bytes (0x00), which usually (but not always!) bring the total size of the illustration to a multiple of 2048 bytes. The next illustration begins after the padding. I’m not sure why the padding is used, since it makes the files about 5 to 8 percent larger, nor why it is sometimes omitted.
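For the usual case, the padded size is just the unpadded size rounded up to the next multiple of 2048. A small Python sketch of that arithmetic, with hypothetical helper names; because the padding is sometimes omitted, the result is only what to expect, not a guarantee:

```python
BLOCK_SIZE = 2048  # padding usually rounds each illustration up to this


def padded_size(unpadded_size: int) -> int:
    """Round a size up to the next multiple of 2048 bytes."""
    return -(-unpadded_size // BLOCK_SIZE) * BLOCK_SIZE


def padding_length(unpadded_size: int) -> int:
    """Number of zero bytes expected after the references list."""
    return padded_size(unpadded_size) - unpadded_size
```

An illustration of 5,000 bytes, for example, would normally be followed by 1,144 zero bytes, for a total of 6,144.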

    IMAGEF.DAT is accompanied by two or three index files, IMAGEF.IA0, IMAGEF.IA1, and IMAGEF.IA2. IMAGEF.IA0 contains a 15-byte index entry for each illustration in IMAGEF.DAT:
    • Bytes 0–6: Illustration identifier
    • Bytes 7–10: Byte offset of the illustration in IMAGEF.DAT, taking the first byte of that file as 1, not 0
    • Bytes 11–14: Size of the illustration in bytes, including the padding
    The byte offset and byte size are little-endian, unsigned integers. I don’t know why the byte order is different for the index. IMAGEF.IA1 and IMAGEF.IA2 (if present) are index files to IMAGEF.IA0 and are not described further.
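A rough Python sketch of using IMAGEF.IA0 to locate illustrations, assuming the index is nothing more than back-to-back 15-byte entries and that its identifiers are space-filled ASCII like those in IMAGEF.DAT. The function names are hypothetical.

```python
import struct
from typing import Iterator, Tuple

ENTRY_SIZE = 15  # each IMAGEF.IA0 entry is 15 bytes long


def parse_index(index_bytes: bytes) -> Iterator[Tuple[str, int, int]]:
    """Yield (identifier, zero-based offset, padded size) per entry."""
    for off in range(0, len(index_bytes) - ENTRY_SIZE + 1, ENTRY_SIZE):
        # 7-byte identifier, then two little-endian 32-bit integers.
        ident, offset_1based, size = struct.unpack_from("<7sII", index_bytes, off)
        # The stored offset counts the first byte of IMAGEF.DAT as 1.
        yield ident.decode("ascii").rstrip(" "), offset_1based - 1, size


def extract_illustration(dat, offset: int, size: int) -> bytes:
    """Slice one illustration (with padding) out of IMAGEF.DAT contents.

    dat may be any sliceable bytes-like object, such as bytes or mmap.
    """
    chunk = bytes(dat[offset:offset + size])
    if len(chunk) != size:
        raise ValueError("index entry points past the end of the data")
    return chunk
```

Since IMAGEF.DAT is large, passing an `mmap.mmap` object as `dat` avoids reading the whole file into memory.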

    Interpreting the other catalog data files on the DVDs is left as an exercise for the reader.