JBIG2: How PDFs and Developers Benefit

JBIG2 Benefits

Once PDFs began carrying images, file size quickly became an issue. As people added larger and larger images to their PDFs, the files took longer to transmit and filled up storage space faster. Luckily, image compression technologies were never far behind to help alleviate these problems. One such technology is JBIG2, a bi-level (black and white) image compression format introduced by the Joint Bi-level Image Experts Group.

Unlike previous bi-level image compression formats (including its predecessor, JBIG1), JBIG2 achieves its compression ratios by recognizing repeated shapes rather than treating every pixel on the page as new information. In short, the algorithm first searches for and recognizes similar groups of pixels within an image. Then it creates symbols to represent the common or repeated shapes it has found and stores them in a table, known as a symbol dictionary. In lossless mode this process does not affect image quality, and the resulting documents can be one quarter to one fifth of their original size.
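The idea of storing each repeated shape once and then referring to it by number can be sketched in a few lines of Python. This is only a toy illustration, not the JBIG2 codec: it assumes the page divides evenly into fixed-size tiles (real encoders segment the page into symbols of varying size), and the names `split_into_tiles`, `encode`, `decode`, `BOX`, and `PLUS` are hypothetical names chosen for this example.

```python
from typing import Dict, List, Tuple

Glyph = Tuple[str, ...]  # one bitmap tile: '#' is a black pixel, '.' is white


def split_into_tiles(page: List[str], tile_w: int, tile_h: int) -> List[Tuple[int, int, Glyph]]:
    """Cut the page into fixed-size tiles (a stand-in for real symbol segmentation)."""
    tiles = []
    for y in range(0, len(page), tile_h):
        for x in range(0, len(page[0]), tile_w):
            glyph = tuple(row[x:x + tile_w] for row in page[y:y + tile_h])
            tiles.append((x, y, glyph))
    return tiles


def encode(page: List[str], tile_w: int, tile_h: int):
    """Store each distinct tile once; the page becomes a list of (symbol id, x, y)."""
    dictionary: List[Glyph] = []
    index: Dict[Glyph, int] = {}
    placements: List[Tuple[int, int, int]] = []
    for x, y, glyph in split_into_tiles(page, tile_w, tile_h):
        if glyph not in index:          # first time this shape is seen
            index[glyph] = len(dictionary)
            dictionary.append(glyph)
        placements.append((index[glyph], x, y))
    return dictionary, placements


def decode(dictionary: List[Glyph], placements, width: int, height: int) -> List[str]:
    """Rebuild the page by painting each referenced symbol at its position."""
    canvas = [["."] * width for _ in range(height)]
    for symbol_id, x, y in placements:
        for dy, row in enumerate(dictionary[symbol_id]):
            for dx, pixel in enumerate(row):
                canvas[y + dy][x + dx] = pixel
    return ["".join(row) for row in canvas]


# Two made-up 3x3 shapes, repeated across a 15x3 page.
BOX: Glyph = ("###", "#.#", "###")
PLUS: Glyph = (".#.", "###", ".#.")
PAGE = ["".join(g[r] for g in (BOX, PLUS, BOX, PLUS, BOX)) for r in range(3)]

if __name__ == "__main__":
    dictionary, placements = encode(PAGE, 3, 3)
    print("distinct symbols stored:", len(dictionary))
    print("placements:", placements)
    print("lossless round trip:", decode(dictionary, placements, 15, 3) == PAGE)
```

Five tiles on the page collapse to two dictionary entries plus five short references, and decoding reproduces the page exactly. On a scanned text page, where the same letter shapes recur hundreds of times, the same principle is what makes the savings add up.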

Whom Does JBIG2 Serve?
Government, judicial, and medical sectors are examples of PDF-intensive workplaces where introducing the JBIG2 format into existing workflows can help IT managers reap noticeable cost savings later on. PDF files that contain JBIG2-compressed images are not only easier to send and share, but also easier to store, quicker to display online, and OCR-ready.

The following table outlines some of JBIG2's main features:

JBIG2 Feature: Higher compression rates than its predecessors (e.g., JBIG1, TIFF G3, and G4).
Benefit: File size reductions that can reach 90% or more. Reduction in storage space and transmission bandwidth.
Example: With JBIG2 compression, a 78 MB uncompressed 500-page PDF document would see its file size reduced to 12.7 MB. The equivalent TIFF files would be approx. 15.8 MB.

JBIG2 Feature: Lossy and lossless compression methods.
Benefit: Lossy yields a higher compression rate without any perceivable information loss.
Example: A cleaning pass that removes dots and artifacts from a scanned document can help JBIG2 compression, because more of the page can then be coded as simple white areas.

JBIG2 Feature: The use of symbol dictionaries.
Benefit:
• A dictionary can be reused for the compression of other images within the same document.
• Eventually one symbol dictionary could be used to recognize the text in the image. It contains the building blocks of a possible OCR procedure to help rebuild font information (if lost).
Example:
• Unique to one PDF document, a global JBIG2 stream can contain a dictionary of symbols used for all the pages of the document.
• Once the dictionary is built, software attempts to recognize letters and build legible text from them.

JBIG2 Feature: The use of arithmetic and Huffman coding schemes for bit representation.
Benefit: Huffman coding uses less page memory and compresses and decompresses faster than arithmetic coding. Arithmetic coding is slower and uses more memory, but yields better compression results.
Example: JBIG2 can support the Huffman and the arithmetic coding algorithms for image structure information such as encoding schemes, references, indexes, sizes, offsets, and popular symbol identities.

JBIG2 Feature: ITU-T T.6 facsimile coding schemes and coding control functions for Group 4 facsimile functionalities, which are activated by an MMR (Modified Modified READ, where READ stands for Relative Element Address Designate) flag.
Benefit: Use of the latest facsimile logic for the compression of building block images.
Example: Any image leaf can be coded using MMR logic. In addition, a symbol in a dictionary or a whole page can be found in the JBIG2 stream as an MMR image.

JBIG2 Feature: Striped-page compression.
Benefit: JBIG2 can compress uninterrupted image flows.
Example: Under specific circumstances, if a scanner sends image information without a page cut, a JBIG2 stream can still take the data and compress it.

JBIG2 Feature: Most PDF viewers support reading JBIG2 (PDF 1.4 and higher).
Benefit: JBIG2 technology can be easily integrated into the PDF's established technologies.
Example: Most of the PDF documents produced by high-end scanners with professional drivers are compressed with JBIG2 technologies.
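
To put the 500-page example from the table in perspective, the short Python sketch below works out the percentage reduction and the compression ratio from the three file sizes quoted there. The function names `reduction_percent` and `compression_ratio` are hypothetical helpers written for this illustration; only the three sizes come from the article.

```python
def reduction_percent(original_mb: float, compressed_mb: float) -> float:
    """Share of the original size removed by compression, in percent."""
    return (1 - compressed_mb / original_mb) * 100


def compression_ratio(original_mb: float, compressed_mb: float) -> float:
    """How many times smaller the compressed file is than the original."""
    return original_mb / compressed_mb


# File sizes quoted in the table for the 500-page document.
UNCOMPRESSED_MB = 78.0
JBIG2_MB = 12.7
TIFF_MB = 15.8

if __name__ == "__main__":
    for label, size in (("JBIG2", JBIG2_MB), ("TIFF", TIFF_MB)):
        print(
            f"{label}: {reduction_percent(UNCOMPRESSED_MB, size):.1f}% smaller, "
            f"ratio {compression_ratio(UNCOMPRESSED_MB, size):.2f}:1"
        )
    # Saving of JBIG2 relative to the TIFF version of the same document.
    print(f"JBIG2 vs TIFF: {reduction_percent(TIFF_MB, JBIG2_MB):.1f}% smaller")
```

For these figures the JBIG2 file works out to roughly 84% smaller than the uncompressed document (about 6:1), compared with roughly 80% for the TIFF files. That is a solid saving, though short of the 90% figure, which is best read as the upper end of what the format can reach.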


JBIG2: Smaller Things are Easier to Handle
Amyuni Technologies has been carefully following the evolution of JBIG2 ever since the format became supported by PDF. The company first included JBIG2 decoding (decompression) capabilities in its PDF Creator and PDF Converter products. Now, with the upcoming 4.5 releases of these products, it is extending that support to include JBIG2 encoding (compression) as well as OCR capabilities. Whether for PDF integration or for publication and distribution purposes, end users and developers will be able to take advantage of JBIG2's powerful black and white compression capabilities.


Franc Gagnon is a technical copywriter for Amyuni Technologies, a PDF solution provider. Amyuni products are integrated into applications used worldwide. The company's software tools are the PDF engines behind several leading business applications created by large software companies such as Intuit, Sage, CaseWare and many more.
