Jpeginsert, a program to insert arbitrary data into JPEG files.

About

Jpeginsert is an open source program for Linux/Unix by Karel 'Clock' Kulhavý from Twibright Labs. Jpeginsert is written in the C programming language and has only 1 source file and a makefile. Jpeginsert inserts data of unlimited length without using steganography (subtle change in picture pixels) and without using EXIF (tags to carry additional information).

Requirements

  1. Linux. Other Unixes like BSD are likely to work too, but the more a system differs from Linux, the higher the probability that the installation process will throw an error. Windows may work under Cygwin or Windows Subsystem for Linux. Android may work under Termux. Apple macOS may work as well.
  2. GNU MP Bignum Library (GMP) installed on your system, ready for static linking.

    How to test:

    1. Whether GMP is installed for dynamic linking: find /lib /usr/lib -iname 'libgmp.so' should find a file.
    2. Whether GMP is installed for static linking: find /lib /usr/lib -name 'libgmp.a' should find a file.

    If not found, install the statically linkable GMP library using the package names below. If your system still doesn't support static linking, remove -static from the Makefile later during compilation.

    1. Debian, Ubuntu, Windows Subsystem for Linux (default Ubuntu choice): package name libgmp-dev
    2. OpenSUSE, Rocky, AlmaLinux, Solus, Void: package name gmp-devel
    3. CentOS, Fedora, Amazon Linux: package names gmp-devel, gmp-static
    4. Cygwin (under Windows), OpenMandriva, ALT Linux P10 and Sisyphus: package name libgmp-devel
    5. ALT Linux P9: package names libgmp-devel, libgmp-devel-static
    6. Alpine, Adélie Linux: package name gmp-dev
    7. PCLinuxOS, Mageia: package name lib64gmp-devel
    8. Slackware, Arch, FreeBSD, NetBSD, KaOS: package name gmp
    9. OpenWrt: package name libgmp10
    10. Windows Subsystem for Linux (other choice than the default Ubuntu): as above, according to the selected distribution
  3. Working C compiler (GCC, Clang). Running cc should print an error different from "cc: command not found". If it doesn't work, install by:
    1. Debian, Ubuntu, Windows Subsystem for Linux (default Ubuntu choice): package name build-essential
    2. PCLinuxOS: package name build-essentials
    3. OpenSUSE: package name devel_basis
    4. Cygwin (under Windows), CentOS, Fedora: package name gcc
    5. Amazon Linux: command sudo yum groupinstall "Development Tools"
    6. Alpine Linux: package name build-base
    7. Windows Subsystem for Linux (other choice than the default Ubuntu): as above, according to the selected distribution
    8. Linux from Scratch: if present, change disable-static to enable-static in the instructions, example: LFS 6.9.11 GMP 6.2.0,
    9. From sources by the GMP vendor, BLFS 5.1 GMP 4.1.3.
  4. GNU Make installed. make --version should say "GNU Make". Other types of the Make program may work as well.
    1. FreeBSD, NetBSD: package name gmake
    2. Other systems: package name: make
  5. Optional: MPV video player (if you want to play videos or audio directly from JPEG files). You can use mplayer, VLC etc., but then you have to adapt the command inside the *.sh scripts yourself. Package name: mpv
  6. Bash command-line shell at /bin/bash. It is normally pre-installed, except on OpenBSD, FreeBSD and NetBSD.

Secure Download

  1. Over a trustworthy (HTTPS) connection (no risk of man in the middle attack and insertion of malware), download the jpeginsert_20220817T03.tgz archive file from Internet Archive. If you get an error about certificates or being snooped upon, someone is trying to hack your computer by a man-in-the-middle attack and infecting the files on the way, or Internet Archive screwed up renewal or configuration of HTTPS certificates.
  2. Check that the file has not been tampered with, using one of these two commands. The first is more trustworthy. However, this page is unencrypted: if it is intercepted and modified on the way, the sums on this page may be manipulated to match the sums of a manipulated file. That is why the HTTPS download in the previous step was necessary.
    sha256sum jpeginsert_20220817T03.tgz
    md5sum jpeginsert_20220817T03.tgz
    
    Results should be as follows. If they are wrong, someone is trying to hack your computer by giving you tampered infected files, or I made a mistake while updating these numbers.
    SHA-256: 587845ebf1f2abb3161a031370118e57bdd2b001e93a4ce104757b0af712e379  jpeginsert_20220817T03.tgz
    MD5: 1425a838d246b60b8f0090e2a8ad4c7b  jpeginsert_20220817T03.tgz
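
    The same SHA-256 check can also be scripted, for example on a system without the sha256sum command. The following is a minimal sketch using Python's standard hashlib module; the function names sha256_of and archive_is_intact are made up for this illustration, and the expected value is the SHA-256 sum listed above.

```python
import hashlib

# SHA-256 sum published on this page for jpeginsert_20220817T03.tgz
EXPECTED_SHA256 = "587845ebf1f2abb3161a031370118e57bdd2b001e93a4ce104757b0af712e379"


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def archive_is_intact(path, expected=EXPECTED_SHA256):
    """True if the file's SHA-256 digest matches the expected hex string."""
    return sha256_of(path) == expected.lower()
```

    Calling archive_is_intact("jpeginsert_20220817T03.tgz") in the download directory should return True for an untampered archive.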
    

Installation

  1. Make sure you have fulfilled the requirements above.
  2. Unpack the archive (e.g. tar xvf jpeginsert_20220817T03.tgz). It will create the directory jpeginsert_20220817T03.
  3. Change directory into the newly created directory: cd jpeginsert_20220817T03
  4. If you could install only the dynamic GMP library, not the static one, remove -static from the Makefile.
  5. make jpeginsert
  6. sudo make install

Testing that the installation works

Run the following commands and they should print messages similar to these:

Command to run: jpeginsert
Should print something like this:
Usage: jpeginsert i carrier_picture.jpg - inserts data from stdin into the carrier_picture.jpg and 
                                          outputs the resulting enlarged .jpg onto the stdout
       jpeginsert x - takes .jpg that contains data inside, on stdin, extracts the data and outputs the data on stdout

Command to run: jpeginsert_play_file.sh
Should print something like this:
Usage: /usr/bin/jpeginsert_play_file.sh JPEG_with_embedded_multimedia_to_play.jpg

Command to run: jpeginsert_play_music_file.sh
Should print something like this:
Usage: /usr/bin/jpeginsert_play_music_file.sh JPEG_with_embedded_multimedia_to_play.jpg

Command to run: jpeginsert_play_url.sh
Should print something like this:
Usage: /usr/bin/jpeginsert_play_url.sh URL_with_JPEG_with_embedded_multimedia_to_play

Usage

Inserting data into a JPEG

  1. Take a plain JPEG picture carrier.jpg into which nothing has been inserted yet
  2. Let's say we want to insert a movie movie.mkv into carrier.jpg and create a big bloated JPEG file bloated.jpg
  3. jpeginsert i carrier.jpg < movie.mkv > bloated.jpg
  4. You can verify that bloated.jpg still shows the original picture, e.g. with gmic bloated.jpg. It may take a long time, but it will not take more RAM.

Extracting data from a JPEG

  1. We have bloated.jpg from the previous step (insertion) and want to see the movie inside it.
  2. If we have the MPV video player installed, we run jpeginsert_play_file.sh bloated.jpg and the movie will play directly.
  3. We can also just extract the movie into a file: jpeginsert x < bloated.jpg > movie_extracted.mkv
  4. If we have MPV and the original file was music, we use jpeginsert_play_music_file.sh bloated.jpg and the picture will show as an artwork, the music playing at the same time.
  5. If bloated.jpg is on the Internet, we don't need to download it; we can play it directly with jpeginsert_play_url.sh
  6. These 3 *.sh scripts can be edited in /usr/bin to accommodate your favorite movie player (mplayer, VLC, etc.) and its options.

Example JPEG files with content already inserted

Warning: some of these files are huge. They may incur Internet fees, slow down your Internet, hang your program, or consume all the RAM and crash your computer (depending on the quality of the software which looks at them).

  1. 3.8 MB music Urban Lullaby: 1. big JPEG, 2. description page - may waste Internet!, 3. file listing.
  2. 529 MB video Famous Unsolved Codes: 1. big JPEG, 2. description page - may waste Internet!, 3. file listing.
  3. 818 MB Paris drone video: 1. big JPEG, 2. description page - may waste Internet!, 3. file listing.
  4. 1.2 GB New York drone video: 1. big JPEG, 2. description page - may waste Internet!, 3. file listing.
  5. 2.47 GB video Plainly Difficult year 4: industrial catastrophes: 1. big JPEG, 2. description page - may waste Internet!, 3. file listing.
  6. 4 GB video Plainly Difficult year 5: industrial catastrophes: 1. big JPEG, 2. description page - may waste Internet!, 3. file listing.
  7. 31 GB video US Senate proceeding: 1. big JPEG, 2. description page - may waste Internet!, 3. file listing.

How does it work?

The original JPEG standard from 1992 prescribes quantization tables, which have to be present in every image, otherwise it cannot be displayed. There are 4 table slots, numbered 0-3. Quantization tables can be defined in blocks of several tables, and each table in a block can go into any of the slots 0-3. The standard allows a slot to be redefined over and over. So we redefine these tables over and over with nonsensical data, which is the data we want to carry. At the end, we insert the original quantization tables, which overwrite the nonsensical data, and the JPEG works as normal.
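
To make the idea of redefining a slot concrete, here is a minimal Python sketch that builds a "define quantization tables" (DQT) block the way the JPEG standard lays it out: the marker bytes FF DB, a 2-byte big-endian length that counts itself, and then for each table one byte holding the precision and slot number followed by 64 coefficients (for 8-bit precision). The function name dqt_segment and the stand-in table contents are assumptions for illustration only; how Jpeginsert actually groups tables into blocks and which slots it uses is not shown here.

```python
import struct


def dqt_segment(tables):
    """Build one DQT block from (slot, coefficients) pairs with 8-bit precision.

    Each coefficients value must be 64 bytes, none of them 0.
    """
    body = b""
    for slot, coeffs in tables:
        if not 0 <= slot <= 3:
            raise ValueError("slot must be 0-3")
        if len(coeffs) != 64 or 0 in coeffs:
            raise ValueError("need 64 coefficients in the range 1-255")
        # High nibble 0 = 8-bit precision, low nibble = slot number.
        body += bytes([slot]) + bytes(coeffs)
    # The length field covers itself (2 bytes) plus the table data.
    return b"\xff\xdb" + struct.pack(">H", 2 + len(body)) + body


# Hypothetical example: slot 0 is first filled with carried data,
# then redefined with a stand-in for the picture's real table.
payload_table = bytes(range(1, 65))
original_table = bytes([16] * 64)
stream = dqt_segment([(0, payload_table)]) + dqt_segment([(0, original_table)])
```

A decoder reading this stream ends up with only the last definition of slot 0, which is why the picture is unaffected. Each table of 64 coefficients costs one extra byte for the precision and slot number, plus 4 bytes of marker and length per block.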

The coefficients are prohibited from being 0, so out of the 256 values 0-255 only 255 are available: 1-255. So we need to do some heavy arithmetic using the GMP library to recalculate base 256 information into base 255 information.
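
The base conversion can be illustrated with a minimal Python sketch. It is not the program's C/GMP code and does not reproduce its actual data format: the function names, the sentinel byte used to keep leading zero bytes, and the choice to shift each base 255 digit by 1 are all assumptions made for this illustration. Python's built-in big integers stand in for GMP.

```python
def encode_base255(data: bytes) -> bytes:
    """Re-express arbitrary bytes as values 1-255 (no zero bytes)."""
    # Assumed sentinel byte 0x01 so leading zero bytes of data are not lost.
    n = int.from_bytes(b"\x01" + data, "big")
    digits = []
    while n:
        n, r = divmod(n, 255)
        digits.append(r + 1)  # shift digit 0-254 into the range 1-255
    return bytes(reversed(digits))


def decode_base255(coeffs: bytes) -> bytes:
    """Inverse of encode_base255."""
    n = 0
    for c in coeffs:
        if c == 0:
            raise ValueError("0 is not a valid coefficient")
        n = n * 255 + (c - 1)
    raw = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return raw[1:]  # drop the sentinel byte
```

Each value in the range 1-255 carries log2(255), about 7.994 bits, instead of 8, so the conversion itself enlarges the data by only about 0.07 percent.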

The process is fast and highly efficient; the overhead is only a few percent. The algorithm has been heavily manually optimized for speed.

What does it do to the JPEG files?

It adds tons of information which has the form of quantization tables which have to be present in the JPEG anyway, otherwise it cannot be displayed. It doesn't change the pixels and also doesn't add EXIF tags which are for additional data.
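
One way to see this effect is to count how many quantization tables a file defines before the picture data starts. The following Python sketch walks the JPEG header blocks and counts them; the function name count_dqt_tables is made up for this illustration, and the sketch assumes a well-formed file whose header consists only of blocks with a length field. A typical plain JPEG defines a handful of tables, while a file processed as described above would be expected to define far more.

```python
def count_dqt_tables(jpeg: bytes) -> int:
    """Count quantization tables defined before the picture data starts."""
    if jpeg[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing start-of-image marker")
    pos = 2
    tables = 0
    while pos + 4 <= len(jpeg):
        if jpeg[pos] != 0xFF:
            raise ValueError("expected a marker at offset %d" % pos)
        marker = jpeg[pos + 1]
        if marker == 0xFF:  # fill byte before a marker, skip it
            pos += 1
            continue
        if marker in (0xDA, 0xD9):  # start of scan or end of image
            break
        length = int.from_bytes(jpeg[pos + 2:pos + 4], "big")
        if marker == 0xDB:  # define quantization tables
            p = pos + 4
            end = pos + 2 + length
            while p < end:
                precision = jpeg[p] >> 4  # 0 = 8-bit, 1 = 16-bit coefficients
                p += 1 + (128 if precision else 64)
                tables += 1
        pos += 2 + length
    return tables
```

For large files it would be better to read from a stream instead of loading all bytes into memory; the sketch keeps everything in memory for brevity.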

Can I make an infinitely big JPEG?

Yes, you can make an infinitely big JPEG. Of course it cannot be stored on a hard disk or a server, it can be only made on the fly in a pipe between programs, as dynamically generated web content from a CGI script, streaming etc.

When can the information get lost?

Some software (and some smartphones) strips EXIF metadata, but it cannot blindly strip quantization tables, otherwise the JPEG couldn't be viewed at all. Such software will not destroy the data. Jpeginsert doesn't use EXIF metadata to store information.

However, other software, such as Facebook, decodes the JPEG, changes its size and compression quality, and re-encodes it. This deletes the information (e.g. a video file or audio file) embedded inside and makes the JPEG small again.

The lossless rotation program Jpegtran deletes the information, even when you specify -perfect and -copy all.

Contact

For email, IRC chat and web IRC chat, see the Twibright Labs contact for Karel Kulhavý.