'cng2jpg'
Decoder Notes

A description of
a Tcl script that
converts 'cng' files
to JPEG files.


The 'Complete National Geographic'
(CNG) box of DVD's


! Note !
More notes and images may be added ---
if/when I re-visit this page.

INTRODUCTION :

In 2011 Dec, while Christmas shopping, I ran across the 'Complete National Geographic' (called CNG for short).

This is a set of 6 DVD disks containing images of the pages of National Geographic Magazines published from 1888 to 2008.

The DVD's contain over 1,400 issues, 8,000 articles, and 200,000 photos.

The DVD's consist of '.cng' image files of the pages of the magazines.

At about 160 pages per magazine (on average), this means there are on the order of 160 x 1400 = 224,000 page images on the DVD's.

If you ignore the many advertising pages (about 40 or more per issue), there may be on the order of 150,000 actual article pages.

The 'reader' software in the collection --- Adobe software for reading the encrypted '.cng' files which contain images of the pages of the magazines --- is designed to run on Microsoft Windows or Mac OS X.

A Linux version of the 'Adobe Air' reader software is not shipped on the DVD's --- and, according to reports at some of the links at the bottom of this web page, 'Adobe Air' was not implemented well for Linux.

In fact, many comments on the web (see the links at the bottom of this web page) indicate that the 'Adobe Air reader' was not implemented well on MS-Windows or Mac OS X either.

In doing some WEB SEARCHES on terms like

'cng file complete national geographic'

I found that even people who had installed the software on MS-Windows or Mac OS X were often left high and dry --- not being able to read the magazine pages, even after some had spent $200 for the collection.

In doing some searching on the nature of the '.cng' files, I found the following quote.

"The cng files are all jpegs, XOR'd bitwise with 239."


Devising a 'cng2jpg' converter :

Using the Gnome calculator, 'gcalctool', one can see that decimal 239 is hex 'EF' --- which is binary 11101111.

In looking at the top of a '.jpg' file, I saw that among the 'unreadable' binary bytes, in characters 7 through 10 of the first part of the typical JPEG file, is the 'human-readable' string 'JFIF'.

I brought up one of the '.cng' files, from one of the CNG DVD's, in the binary editor 'bless' (binary-less).

Sure enough, the eighth and tenth characters of the '.cng' file were identical hex codes --- corresponding to the location of the two 'F' characters of JFIF.

But they were not the hex code for the letter 'F'.

The 'bless' editor has a feature that lets you pick a hex byte and do a logical operation on the byte (like an XOR operation, with a code like hex 'EF' or binary 11101111).

Sure enough, when I performed that operation on the 8th (and 10th) character of the '.cng' file, I got the character 'F'.
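The same check that was done by hand in 'bless' can be sketched in Tcl. This is only a minimal sketch, assuming Tcl 8.5 or later (for the 'rb' open mode and the unsigned 'cu' scan type). The proc name 'looksLikeCng' and the file name in the usage comment are made up for illustration.

```tcl
# Hypothetical helper (not part of the original scripts):
# returns 1 if bytes 7 through 10 of the file, each XOR'd with 239 (hex EF),
# spell the JPEG marker string "JFIF"; returns 0 otherwise.
proc looksLikeCng {path} {
    set f [open $path rb]
    set head [read $f 10]
    close $f
    if {[string length $head] < 10} {
        return 0
    }
    # 'cu*' converts every byte to an unsigned integer (0..255), as a list.
    set codes {}
    binary scan $head cu* codes
    set decoded ""
    # List indexes 6..9 are the 7th through 10th bytes.
    foreach code [lrange $codes 6 9] {
        append decoded [format %c [expr {$code ^ 0xEF}]]
    }
    return [expr {$decoded eq "JFIF"}]
}

# Usage (the file name is only an example):
#   puts [looksLikeCng page_001.cng]
```

A plain JPEG file fails this test, because XOR-ing its already-readable 'JFIF' bytes with 239 scrambles them.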

I knew that the Tcl script language supported 'bit-wise' operations, so I decided I would try to make a Tcl script that converted a '.cng' file to a '.jpg' file.

There aren't many good examples of doing an XOR operation on byte codes (whether hex or binary or decimal), either in the various Tcl-Tk textbooks or on the Internet, but I was able to devise a 'cng2jpg.tcl' script.

    (This is a link to a copy of the source. 'Left-click' to display the source. 'Right-click' to download it.)

After the script 'cng2jpg.tcl' opens the input and output files, the following 5 Tcl statements do the main work.


## NOTE:
## We are reading one byte at a time here.

set BYTEin [read $f1 1]

## In 'binary scan', 'c' indicates
## we want the new variable CHARin
## to hold the byte as a signed 8-bit integer code.

binary scan "$BYTEin" c CHARin

## TRANSLATION OF THE BYTE IS HERE ---
## XOR with hex 'EF' = decimal 239 for each byte.

set XORout [expr { $CHARin ^ 239 }]

## Making sure the byte is in binary form,
## for writing out.

set BYTEout [binary format c $XORout]

## We write out one byte at a time here.

puts -nonewline $f2 $BYTEout
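
For context, here is a sketch of how the five statements might sit inside a complete read loop. The file opening, the binary-translation settings, and the end-of-file test are my assumptions about the surrounding code, and the proc name 'cng2jpgBytewise' is made up for this illustration.

```tcl
# A sketch only: wraps the five statements above in a loop.
# The open/fconfigure/eof handling is assumed, not copied from 'cng2jpg.tcl'.
proc cng2jpgBytewise {inPath outPath} {
    set f1 [open $inPath r]
    set f2 [open $outPath w]
    # Binary translation: no newline or character-encoding conversion
    # is applied to the bytes on either channel.
    fconfigure $f1 -translation binary
    fconfigure $f2 -translation binary
    while {1} {
        set BYTEin [read $f1 1]
        # An empty read means we have reached the end of the input file.
        if {[string length $BYTEin] == 0} {
            break
        }
        binary scan $BYTEin c CHARin
        set XORout [expr {$CHARin ^ 239}]
        set BYTEout [binary format c $XORout]
        puts -nonewline $f2 $BYTEout
    }
    close $f1
    close $f2
}
```

Since XOR with the same value undoes itself, running the proc on its own output gives back the original file.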

Speed of converting each '.cng' file :

The speed of the conversion process might be improved by seeing if we can eliminate or combine some of these 5 statements.

    (Someday, I may see if I can eliminate the 'binary scan' or the 'binary format' commands --- and perhaps eliminate the intermediate variables like 'CHARin', 'XORout', and 'BYTEout'.

    But I like the almost-self-documenting readability of the five statement approach.)
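
One possible way of combining the statements is to handle a block of bytes per 'binary scan' and 'binary format' call, rather than one byte. The following is only a sketch, assuming Tcl 8.5 or later. The proc names and the 65536-byte chunk size are arbitrary choices of mine, and I have not measured whether it is actually faster.

```tcl
# Sketch: XOR every byte of a block of binary data with 239 (hex EF).
proc xor239 {data} {
    set codes {}
    # 'cu*' converts all the bytes to a list of unsigned integers (0..255).
    binary scan $data cu* codes
    set out {}
    foreach code $codes {
        lappend out [expr {$code ^ 239}]
    }
    # 'c*' packs the whole list of integers back into bytes.
    return [binary format c* $out]
}

# Sketch: convert a file in chunks instead of a byte at a time.
proc cng2jpgBlockwise {inPath outPath {chunk 65536}} {
    set fin  [open $inPath rb]
    set fout [open $outPath wb]
    while {![eof $fin]} {
        set data [read $fin $chunk]
        if {[string length $data] > 0} {
            puts -nonewline $fout [xor239 $data]
        }
    }
    close $fin
    close $fout
}
```

This trades some of the self-documenting feel of the five-statement version for fewer command calls per byte.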

The comments in the script point out that ... although we appear to be reading-from and writing-to the two disk files a byte at a time, the reading and writing really goes quite fast, because ...

Underneath it all, the file 'reads' are actually buffered.

So the reading of the CNG file should go really fast, in blocks.

We are actually fetching a byte at a time from a cache (in-memory) input buffer, which should go very fast.

And ...

As with reading, underneath it all, the file 'writes' are actually buffered and are done in blocks.

We are actually putting a byte at a time into a cache (in-memory) output buffer.

The actual block-writes to the resulting JPEG file should go very fast --- each block being thousands of bytes long (the default Tcl channel buffer size is 4096 bytes).

In fact, the large-sized (not-a-thumbnail-image) '.cng' files (and the corresponding output JPEG files of exactly the same size) are on the order of 60 to 450 Kilobytes in size --- depending mainly on the variety of colors and the nature of the color changes in the page image.

In pixels instead of bytes:

The JPEG files are generally about 1340x2000 pixels in size.

The height, 2000, is generally constant, but the width varies from about 1320 to 1350.

    Note that these 1340x2000 images are about 2.7 Megapixel images, whereas today's (2011) digital cameras take pictures of resolution 6 Megapixels or even 14 Megapixels.

    So these page images are about the resolution (that is, picture quality) of very low-end digital cameras.

    There is more on the (poor) image quality of the files below.

This translator script converts each CNG file to a JPEG within a few seconds --- on the order of 50 to 100 Kilobytes per second --- on a typical 2011-era mid-range PC.

To make it easy to apply that script to many '.cng' files in a directory, I made a 'wrapper shell script' to call on the 'cng2jpg.tcl' script for each '.cng' file selected in a month/issue directory.

(This link provides a copy of the 'multiFile' wrapper-script source. 'Left-click' to display it . . . 'right-click' to download it.)
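
My 'multiFile' wrapper is a shell script, but the idea of looping over the '.cng' files of one month/issue directory can be sketched in Tcl as well. This is an illustration only, assuming Tcl 8.5 or later. The proc names are made up, and 'convertCmd' stands for any command that accepts an input path and an output path (an assumption, since the wrapper's real calling convention is not shown here).

```tcl
# Hypothetical helper: map 'dir/NAME.cng' to 'dir/NAME.jpg'.
proc jpgNameFor {cngPath} {
    return "[file rootname $cngPath].jpg"
}

# Sketch: convert every '.cng' file in one directory.
# Returns the number of files that were converted on this run.
proc convertDirectory {dir convertCmd} {
    set count 0
    foreach cng [lsort [glob -nocomplain -directory $dir -- *.cng]] {
        set jpg [jpgNameFor $cng]
        # Skip pages that already have a '.jpg' from an earlier run.
        if {[file exists $jpg]} {
            continue
        }
        # convertCmd is a command prefix; the two paths are appended to it.
        {*}$convertCmd $cng $jpg
        incr count
    }
    return $count
}

# Usage (directory and converter proc name are only examples):
#   convertDirectory /media/usb/19640101 myConverterProc
```

Skipping files that already have a '.jpg' makes it safe to restart after an interrupted run.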

The nature of the directories of '.cng' files is revealed in images below --- images of the directory names and file names, as shown in a GUI file manager.

This 'wrapper script' is suited to being implemented as a Nautilus (or Caja) script --- where Nautilus is the GUI file manager available on Linux GNOME desktop-environment systems --- and Caja is the GUI file manager available on Linux MATE (pronounced mah-tay) desktop-environment systems.


Magazine Issue Directories
('month-or-issue directories') :

(converting multi-months of files)

It turns out that, even with the 'multiFile' cng2jpg wrapper script, it is tedious to convert the '.cng' files to '.jpg' files because the '.cng' files are in about 1,400 month-or-issue directories --- with about 100 to 250 '.cng' page-image files in each month/issue-directory --- for a total of about 225,000 '.cng' (magazine page) files.

That 'multiFile' script, which is suited to being applied to one directory of files at a time (about 100 to 250 files), would need to be applied to 1,400 month/issue directories --- a tedious process.

Before going on, let us pause to look at the example images of the directories on the Disk1 and Disk3 DVD's below, to see how the files are organized on the DVD's.



The image above shows CNG Disk 1 top level directories.

A couple of 'decades directories' are under the
'disk1/images' directory path.

The month/issue-directories are under the
'199x' and '200x' decades-directories ---
which contain 1995 through 2008 month/issue-directories.


The image above shows CNG Disk 1 month/issue directories
under the 'disk1/images/200x' directory path.

Above we see the twelve year-2000 month/issue-directories ---
and the start of the year-2001 month/issue directories.


The image above shows CNG Disk 1 CNG files in
a month/issue directory --- '20010101'.

Above we see the start of the page image files
--- '.cng' files --- for month 2000_01 = Jan 2000
--- pages '001', '002', '003', etc.

The prefix 'NGM' apparently stands for
'National Geographic Magazine'.


The image above shows CNG Disk 3 top level directories.

Above we see that Disk 3 does not have the Mac 'osx'
installer directory, nor the Microsoft 'windows'
installer files. They were only on Disk 1.

There is just the 'disk3/images' directory path,
with the '196x' and '197x' decades-subdirectories.


The image above shows CNG Disk 3 month/issue directories.

Above we see the start of the 197x 'month/issue-directories'
--- for months 1970 Jan, 1970 Feb, etc. --- into 1971.

Copying the files from the DVD :

The 6 DVD's hold about 40 Gigabytes of '.cng' files. If one is copying those files to a USB disk drive or 'pen' drive(s) --- for preservation, conversion-to-JPEGs, and various kinds of image processing --- it seems handy to keep the '.cng' files (and the translated '.jpg' files) in month/issue directories with the same names as on the DVD's.

Example month/issue-directory names for the year 1964:

  • 19640101
  • 19640201
  • 19640301
  • ...
  • 19641201

    They could have dropped the '01' on the end, but it was probably left there to indicate that the magazine was intended to be distributed near the beginning of the indicated month, rather than near the end.

    Furthermore ...

    They may have wanted to allow for including images of pages from National Geographic 'special issues', which might have caused there to be more than one magazine issue in a month.

To make a long story shorter, I eventually managed to copy the '.cng' files onto about 4 USB pen drives.

(They would probably be safer on an 'external' USB disk drive, because I have had problems with pen drives/sticks becoming corrupted.)

So now the task was to convert about 225,000 '.cng' files to '.jpg' files.

To be able to simply select a bunch of month/issue-directories in the Nautilus file manager and then choose a script to run to convert the 100-plus '.cng' files in each of the selected directories, I made another 'wrapper' shell script that calls on the 2 scripts above to do the 'bulk' conversion.

(This is a link to a copy of the 'multiDir' script source. 'Left-click' to display the script . . . 'right-click' to download it.)

To summarize, I have

  • a 'multiDir' wrapper shell-script that can be applied to multiple month/issue directories, and that script calls:

  • a 'multiFile' wrapper shell-script that can be applied to a single month/issue directory, to convert the 100 to 250 '.cng' files there to '.jpg' files, and that script calls:

  • the 'cng2jpg' Tcl script that converts a single '.cng' file to a '.jpg' file.
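
The top level of that chain can be sketched in Tcl too, although my 'multiDir' wrapper is a shell script. This is a rough illustration, assuming Tcl 8.5 or later. The proc name is made up, and 'perDirCmd' stands for whatever command handles one directory and returns a count (an assumption for this sketch).

```tcl
# Sketch: apply a per-directory command to each selected month/issue directory.
# Returns a flat list of directory names and per-directory results.
proc convertDirectories {dirs perDirCmd} {
    set summary {}
    foreach dir $dirs {
        if {![file isdirectory $dir]} {
            continue
        }
        # Month/issue directories have 8-digit names such as 19640101.
        if {![regexp {^\d{8}$} [file tail $dir]]} {
            continue
        }
        # perDirCmd is a command prefix; the directory path is appended to it.
        lappend summary [file tail $dir] [{*}$perDirCmd $dir]
    }
    return $summary
}

# Usage (paths and the per-directory proc name are only examples):
#   convertDirectories {/media/usb/19640101 /media/usb/19640201} myPerDirProc
```

Checking the 8-digit name is a small guard against accidentally selecting some other directory in the file manager.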

So now, after I copy many National-Geographic-years of '.cng' files to a USB disk drive --- or to multiple USB 'pen' drives --- I am ready to convert massive numbers of '.cng' files to '.jpg' files.


Time required for
(massive) copying and conversion :

Note that both

  • the copying from DVD's to USB drive(s), and

  • the cng-to-jpg conversion process

will take many hours to perform.

It typically takes on the order of an hour to copy all the month/issue-directories (about 7.7 Gigabytes) from one of the DVD's to a USB drive --- so on the order of 6 hours to copy the approximately 225,000 '.cng' files from all 6 DVD's.

    I wanted to copy files from the DVD's to a removable USB drive (or drives) rather than to the hard drive on my desktop or netbook computers --- because of the huge number of space-gobbling files --- many of which I may never look at.

    I did not want to put 40 Gigabytes of such files on any of my 'internal' hard disk drives.

The conversion of each magazine's '.cng' files (about 160 page images per magazine issue, on average) takes about 3 secs/page x 160 pages = 480 secs = 8 minutes per magazine-issue, for about 1,400 magazine issues.

So the entire cng-to-jpg conversion process may take about 8 x 1,400 = 11,200 minutes = about 187 hours --- nearly 8 days.

(By using a computer with a high-end processing chip, the time may be reduced to about 1 to 2 days.)
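
The estimate above can be re-checked with a few lines of Tcl. The three input numbers are the rough figures quoted in the text, and the variable names are just for illustration.

```tcl
# Rough inputs taken from the text above.
set secsPerPage   3
set pagesPerIssue 160
set issues        1400

# Dividing by 60.0 (not 60) forces floating-point arithmetic.
set minutesPerIssue [expr {$secsPerPage * $pagesPerIssue / 60.0}]
set totalHours      [expr {$minutesPerIssue * $issues / 60.0}]
set totalDays       [expr {$totalHours / 24.0}]

puts [format "%.0f min/issue, %.1f hours, %.1f days" \
    $minutesPerIssue $totalHours $totalDays]
# prints: 8 min/issue, 186.7 hours, 7.8 days
```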


The POOR image quality :

That's a big investment in time for images

  • many of which are ads (for cars, drugs, cameras, dog food, etc. --- about 30 ad pages at the front and about 10 ad pages at the back, of each issue), and

  • many of which were scanned at unbelievably poor quality.

Here are some examples of the poor quality :

 

The first (or left) image is a reduced size image of a NatGeo magazine page containing a painting of a Mars landscape.

The second (or right) image is a full-sized image (the actual scan or delivered/sold image) of a portion of the sky --- clipped from the 1311x2000 pixel full-size image of the National Geographic magazine page.

Note the blotchy patches of color, rather than a smooth color gradient.

I can see why many people feel that they were 'ripped off'.

I ran across a picture of a fox whose fur, on moderately close examination (of the full-sized image, about 1300x2000 pixels), looked like the hairs were matted over a mesh.

The mesh showed through the fur in spots.

Strange!

Surely there was not an actual fabric mesh on the fox.

A fox with a toupee???

More likely, a strange form of image dithering???

Another example:

I ran across a picture of the Golden Gate bridge shrouded in a low fog, a common occurrence in the Bay Area.

At the top of the picture was clear blue sky, and on moderately close examination (of the full-sized image, about 1300x2000 pixels), the sky was made up of vertical stripes, of blue and light-blue.

Another strange form of image dithering???


Images crossing multiple pages :

After converting a couple of years of '.cng' files and browsing through the resulting '.jpg' files, I noticed that some images were separated across two adjacent pages.

I did some WEB SEARCHES on terms like

'imagemagick convert merge images'

'imagemagick convert join images'

'imagemagick convert append images'

and found that I could paste two image files together 'horizontally' (side-by-side, rather than over-under) by using the 'convert +append' command.

So that I would be able to simply select a pair (or more) of image files in the Nautilus file manager and then choose a script to run to 'append' the selected '.jpg' files together (horizontally or vertically), I made a 'multiAPPEND' script (link to the 'multiAPPEND' script source here) that makes a new '.jpg' file, in the same directory as the selected '.jpg' files.

It worked like a charm on the first test that I chose --- a picture of a border collie spread across 2 pages.

However, on some subsequent images, I found it did not work so well, because, apparently, many of the page pairs were not carefully scanned, and hence do not match up well.
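
The side-by-side join described above can be sketched from Tcl by building the ImageMagick 'convert' command as a list. My 'multiAPPEND' script is a shell script, so this is only an illustration, assuming Tcl 8.5 or later and ImageMagick installed. The proc name and the file names in the usage comment are made up.

```tcl
# Sketch: build the ImageMagick command that joins images into one file.
# '+append' joins left-to-right; '-append' joins top-to-bottom.
proc appendCommand {inputs output {direction horizontal}} {
    if {[llength $inputs] < 2} {
        error "need at least two input images"
    }
    set op [expr {$direction eq "horizontal" ? "+append" : "-append"}]
    return [list convert {*}$inputs $op $output]
}

# Usage (file names are only examples; requires ImageMagick):
#   exec {*}[appendCommand {page_034.jpg page_035.jpg} spread_034_035.jpg]
```

Keeping the command as a list, rather than one long string, avoids quoting trouble with file names that contain spaces.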


What next?
(viewing these images on Linux,
processing them, etc.)

So now I am prepared to copy many 'month/issue-directories' of '.cng' files from the DVD's to an external USB disk drive --- or to USB sticks --- and convert many or all of the '.cng' files to '.jpg' files.

Then, instead of using the 'Adobe Air' reader (which has implementation problems and super-frustrating performance problems, as noted in external web links referenced at the bottom of this page), I can use an image viewer on Linux (the PC operating system that I use) to browse through the '.jpg' files.

For example, I can use the 'eog' (Eye of Gnome) image viewer to quickly 'page through' the image files in a directory --- quickly skipping over the many advertising images in the month/issue-directory.

    (I could then delete the ad images or move them to a sub-directory of that month/issue-directory.)

Note that many images (photos, maps, graphs, or whatever) occupy a portion of a magazine page.

So those images occupy a portion (sub-rectangle) of each '.jpg' file.

To extract an image (such as a map image), I plan to use an image editor (such as 'mtpaint' on Linux) to extract the image as a separate image file.

    (And I may find it better to save the extracted image as a PNG file, rather than a JPEG file.)


Unfortunately, as noted above with some sample images and as noted in the first external web link in the 'EXTERNAL LINKS' section below, the people who assembled the CNG DVD's did not scan the magazine pages at very high definition --- and they did not capture the colors very well.

So the quality of the images provided on these DVD's is nowhere near the quality seen on the printed magazine pages.

Apparently, instead of using a minimum of 300 dots (or pixels) per inch --- low-end ink-jet printer resolution --- they decided to publish the images with a resolution of about 200 pixels per inch.

    A National Geographic page is 10 inches by 7 inches (or actually 10 inches by about 6 and 7/8 inches) --- about 25.4 centimeters by 17.46 centimeters.

    At least, that was the page size in a 1994 magazine issue.

    So the vertical resolution of the page images is 2000 pixels / 10 inches = 200 pixels per inch.

    The horizontal resolution is about 1340 pixels / 6.875 inches = about 195 pixels per inch.
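
    The same arithmetic, as a few lines of Tcl (the pixel and page sizes are the approximate figures quoted above, and the variable names are just for illustration):

```tcl
# Approximate figures taken from the text above.
set widthPx      1340
set heightPx     2000
set pageWidthIn  6.875
set pageHeightIn 10.0

set verticalPpi   [expr {$heightPx / $pageHeightIn}]
set horizontalPpi [expr {$widthPx / $pageWidthIn}]
set megapixels    [expr {$widthPx * $heightPx / 1.0e6}]

puts [format "%.0f ppi vertical, %.1f ppi horizontal, %.2f megapixels" \
    $verticalPpi $horizontalPpi $megapixels]
# prints: 200 ppi vertical, 194.9 ppi horizontal, 2.68 megapixels
```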

What is even worse --- they seem to have used a poor JPEG compression scheme that 'dithered' the images horribly.

Hopefully National Geographic has the magazine page images (and photos, maps, etc.) digitally stored at a higher resolution (and not loss-compressed and not dithered).

In particular, hopefully, for posterity, National Geographic will publish the images (or at least the best photos and maps) at a higher resolution (and a good compression scheme --- if any compression is used) someday.

And hopefully they will store the text someday as actual character data (ASCII, for example), not just as text images embedded in scanned page images.

In any case, at least I will be able to preserve my investment in this DVD collection in a readable form --- relatively immune to future operating system changes and immune to the fact that 'Adobe Air' (the reader) will probably not run in future versions of the Microsoft, Mac, and Linux OSes.


Text search of the magazine articles :

Someday I could even try to implement, on Linux, the text search facility provided with the CNG DVD's (as outlined in one of the links below).

But I am not highly motivated to do so, because, as some have pointed out, the text search facility is pretty lame.

Apparently someone just chose some keywords for many of the pages, and created a cross-reference (via an SQL database) between those keywords and the magazine pages.

It is nowhere near a complete database of all the text in the magazine articles.


A tedious task remains :
(to be documented here?)

I may slowly accumulate the 'good' pages --- and better quality images --- from among all the advertising pages of the magazine issues.

But this 'gleaning process' is going to be a long, tedious task --- and I have many better things to do.

If I make some progress in determining a way of preserving (that is, organizing) my investment in these DVD's (their image files), then I may document that progress here at some future date.

Since I find maps quite appealing, and those images may not be too garbled by the poor quality of the published images, I may try to collect many of the map images together, in some sort of organized form --- such as in a set of web pages --- with the maps organized in some way.

(By age? By region of the world? By keywords?).

That would be a big project in itself --- especially with my motivation flagging after seeing the poor (badly dithered) quality of the map images.


Intent of this page :

This page is mainly meant as a personal page --- to remind me of how I can preserve my investment in these DVD's.

That is, these scripts make it possible for me to access the page images even though the Adobe Air viewer is, reportedly, a pain to get running on Linux --- and an EXTREME-(performance)-pain to use on Linux.

In fact, people have had 'show-stopping' problems with the viewer on various versions of the Microsoft Windows and Mac operating systems, indicating that the viewer will probably not be runnable on the computers of the coming years.

If anyone has found that they cannot access their copy of the DVD's using the viewing software delivered on the DVD's, AND, if they stumble across this web page, they may want to try implementing these scripts (or these techniques) on their operating system, to make at least some of their CNG files viewable.

NOTE that this page is not intended to make available a copy of the JPEG files translated from the CNG files.

A person should buy a copy of the DVD's before using these scripts.

But, because of the poor job 'someone' did in scanning the page images (apparently 'dithering' the scans in some horrible way), most people (if they know that fact) would not be motivated to buy these images, as indicated by disappointment (and rage) seen in links below.

I am at the end of my documentation of this (mis)adventure.

My thanks to 'Bob' and his 'XOR' with '239' hint.

    (Bob, wherever you are, I would like to know how you determined that.)

Some links to more information on the National Geographic files follow.

SOME EXTERNAL LINKS on 'cng' files
and the CNG package :

  • I found a 2010april post at a blog of 'Andrei Barbu' (of Purdue U.) --- blog title: "The Complete National Geographic".

    In that blog post, Andrei noted that several people complained that they can no longer view their files because of operating system changes that are incompatible with the 'Adobe Air' reader software.

    Andrei, in his post, points out :

      "My main complaint is that they used jpeg to compress the images and did so at really high compression leading to lots of artifacts.

      Looking at a page that has a black background can be painful.

      Which is a bit of a shame as one of the greatest things about NGM [National Geographic Magazine] is the high quality of the photographs and printing.

      The viewer [Adobe Air] also has way too many needless animations and there's sadly no way to disable them.

      So I wouldn't mind knowing the file formats involved ... more importantly a viewer that just does its job and gets out of the way would be great."

    [This latter 'viewer' statement is what 'Bob' was responding to, with his 'hack up a viewer' statement below.]

    Andrei also says :

      "Note that, as with most Adobe technologies on Linux, it [Adobe Air] is needlessly CPU intensive, it pegs my CPU at 100% ...".

    So you see several reasons for having a way to backup and view the cng files --- using a viewer other than the 'flaky' and extremely slow Adobe viewing system provided.

    Note that 'Bob', in a 2011 October response to this posting, gives the hint :

      "The cng files are all jpegs, XOR'd bitwise with 239. If anyone wants to hack up a viewer, feel free."

  • See blog Notes of 'Gordon' - dated March 2007, which express similar frustrations:

      "It's exasperating to think that all that historical data, all those articles, all those photographs, are sitting on my shelf and cannot be viewed with today's operating system.

      (The Reader/Searcher looked to my naive eye like a kissing cousin to [Adobe] Acrobat.)

      If only the source code for the Complete National Geographic CD-ROM set were available and could be updated to run natively on OS-X and other contemporary platforms ...

      Let's learn from this. Don't invest in products dependent on closed source solutions."

  • A 2009 October page at blogs.nationalgeographic.com indicated that others have bought the product and were not able to install it on their computer.

    Here's a quote from that page indicating more problems with the way the 'Complete National Geographic' was implemented :

      "@ NatGeo: Shame on you for selling such a bloated, slow, crapulous piece of software. I can't believe I paid $200 for this."

    And :

      "vince said: This product is not ready for prime time (i.e. should not be sold to anyone in its present form). The time to move from page to page using the DVDs is ridiculously slow. It is virtually unuseable in its present form."

    In addition, there are many postings by people saying they can't install the 'Adobe Air' reader software on their operating system --- or the reader quit working after a reader upgrade or an operating system upgrade.

  • A 'cng.htm' page at 'bilbo.online.bg' (in 2011) gave some insight on how the (rather limited) search facility in the Complete National Geographic (CNG) can be implemented based on an SQL database.

    That page may be dead, but here is a backup copy --- preserved here since the Google-cached page will probably disappear sometime around 2012.

  • A thread at ubuntuforums.org documented (in 2010) some attempts --- successful and not --- to get the 'Adobe Air' CNG reader running on Linux.

  • There was an ad (in 2011) for The Complete National Geographic, at amazon.com --- list price $79.99.

    (Every issue since 1888 --- on six DVD's --- over 1,400 issues, 8,000 articles, and 200,000 photos.)

  • In 2011, there was (very little) info on the cng file extension, at 'file-extensions.org'

    The web page said :

    "Currently there is no specific information available, about how to convert this file extension."

    But the scripts on THIS web page change that.

  • In 2011, there was a query titled
    'Hacking Complete National Geographic' at
    www.linuxidx.com/linux.php/?q=Hacking+Complete+National+Geographic

    That information may no longer be available, but you can try a WEB SEARCH on keywords such as

    'hacking complete national geographic'

  • A WEB SEARCH on keywords such as 'cng file convert national geographic' was used to find URL's like those above.

    You could try adding other keywords such as 'jpg' or 'jpeg' or 'xor' or '239'.


Page history:

Page was created 2011 Dec 23.

Page was changed 2011 Dec 31.

Page was changed 2019 Jan 05.
(Added css and javascript to try to handle text-size for smartphones, esp. in portrait orientation. Changed some links.)

Page was changed 2019 Jul 15.
(Specified image widths in percents to size the images according to width of the browser window. Some minor reformatting of paragraphs. Added some links.)