Data conversion is the conversion of computer data from one format to another, usually for the purpose of application interoperability or to enable the use of new features. At the simplest level, data conversion can be exemplified by the conversion of a text file from one character encoding to another. More complex conversions are those of office file formats, and conversions of image and audio file formats involve details that most ordinary computer users are not familiar with. Information can easily be discarded by the computer, but adding information takes effort.
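The simplest case mentioned above, converting text between character encodings, can be sketched in Python as follows. This is a minimal illustration, not a complete tool: the function name `convert_encoding` and the sample bytes are assumptions made for the example, while `bytes.decode` and `str.encode` are standard Python methods.

```python
def convert_encoding(data: bytes, source: str, target: str) -> bytes:
    """Re-encode text bytes from one character encoding to another."""
    # Decode the raw bytes into abstract characters using the source encoding.
    text = data.decode(source)
    # Encode the same characters again using the target encoding.
    return text.encode(target)


# "café" stored in Latin-1: the letter é is the single byte E9.
latin1_bytes = b"caf\xe9"
utf8_bytes = convert_encoding(latin1_bytes, "latin-1", "utf-8")
print(utf8_bytes)  # b'caf\xc3\xa9' (é becomes the two bytes C3 A9)
```

If the target encoding cannot represent a character, `str.encode` raises `UnicodeEncodeError` by default, which signals that the conversion would have to discard information.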
For example, a truecolor image can easily be converted to grayscale, while the opposite conversion is a painstaking process. Converting a Unix text file to a Microsoft (DOS/Windows) text file involves adding information, but that addition is easily done with a computer, since it is rule-based: each line feed is replaced by a carriage return followed by a line feed. The addition of color information to a grayscale image, by contrast, cannot be done by rule, since the grayscale image does not record which colors each section of the picture had; only a human knows which colors are needed, and there are no rules that can be used to automate that process. Converting a 24-bit PNG to a 48-bit one does not add information to it; it only pads existing RGB pixel values with zeroes, so that a pixel with a value of FF C3 56, for example, becomes FF00 C300 5600.
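The contrast between rule-based addition and irreversible discarding can be illustrated with a short Python sketch. The function names `unix_to_dos` and `to_gray` are hypothetical, and the grayscale weights (0.299, 0.587, 0.114, a common luma weighting) are an assumption for the example; other weightings are in use.

```python
def unix_to_dos(data: bytes) -> bytes:
    """Turn LF line endings into CRLF line endings."""
    # Normalize any existing CRLF first so they are not doubled.
    normalized = data.replace(b"\r\n", b"\n")
    # The rule: every line feed gets a carriage return in front of it.
    return normalized.replace(b"\n", b"\r\n")


def to_gray(pixel):
    """Reduce an (R, G, B) pixel with 8-bit channels to one gray level."""
    r, g, b = pixel
    # Integer form of the assumed weights 0.299, 0.587 and 0.114.
    return (299 * r + 587 * g + 114 * b) // 1000


print(unix_to_dos(b"one\ntwo\n"))  # b'one\r\ntwo\r\n'

# Two clearly different colors collapse to the same gray level,
# so the gray value alone cannot tell which color it came from.
print(to_gray((255, 0, 0)), to_gray((0, 130, 0)))  # 76 76
```

The line-ending conversion can be undone by the inverse rule, whereas the grayscale result maps many colors to one value and so has no inverse.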
The conversion makes it possible to change a pixel to have a value of, for instance, FF80 C340 56A0, but the conversion itself does not do that; only further manipulation of the image can. Converting an image or audio file in a lossy format (like JPEG or Vorbis) to a lossless (like PNG or FLAC) or uncompressed (like BMP or WAV) format only wastes space, since the target reproduces the same image or sound, including its loss of original information (the artifacts of lossy compression). A JPEG image can never be restored to the quality of the original lossless image from which it was made, no matter how often the user applies the JPEG artifact removal feature of an image manipulation program. The computer can add information only in a rule-based fashion; most users want additions of information that can only be accomplished by humans.
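The widening from 24-bit to 48-bit color described above can be sketched in Python. The sketch follows the zero-padding rule given in the text; the function names `widen` and `narrow` are hypothetical, and actual converters may use a different scaling rule.

```python
def widen(pixel):
    """Pad each 8-bit channel with a zero byte to make it 16 bits wide."""
    return tuple(channel << 8 for channel in pixel)


def narrow(pixel):
    """Drop the low byte of each 16-bit channel."""
    return tuple(channel >> 8 for channel in pixel)


original = (0xFF, 0xC3, 0x56)
wide = widen(original)
print(" ".join(f"{c:04X}" for c in wide))  # FF00 C300 5600

# Narrowing recovers the original exactly, so widening added no information.
print(narrow(wide) == original)  # True

# A value such as FF80 C340 56A0 only arises from later editing; narrowing
# it gives the same 24-bit pixel, which shows the low bytes are new detail.
edited = (0xFF80, 0xC340, 0x56A0)
print(narrow(edited) == original)  # True
```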
Data conversion can also suffer from inexactitude, the result of converting between formats that are conceptually different. The WYSIWYG paradigm, extant in word processors and desktop publishing applications, versus the structural-descriptive paradigm, found in SGML, XML and many applications derived therefrom, like HTML and MathML, is one example. Using a WYSIWYG HTML editor conflates the two paradigms, and the result is HTML files with suboptimal, if not nonstandard, code.
Successful data conversion requires thorough knowledge of the workings of both source and target formats. Where the specification of a format is unknown, reverse engineering is needed to carry out the conversion. Reverse engineering can achieve a close approximation of the original specification, but errors and missing features can still result. The binary formats of Microsoft Office documents (DOC, XLS, PPT and the rest) were long undocumented, and anyone who sought interoperability with those formats needed to reverse-engineer them. Such efforts have been fairly successful, so that most Microsoft Word files open without any ill effect in the competing OpenOffice.org Writer, but the few that do not, usually very complex ones that use more obscure features of the DOC file format, show the limits of reverse engineering.


