Tuesday, February 19, 2008

Large File Support in Linux for C/C++ operations

Overview

My current project deals with high-definition movies that are encrypted and used for IPTV. The problem occurred when I tried to encrypt a real-world 4.8 GB movie. The application server (OC4J) crashed with the error:
File size limit exceeded $JAVA_HOME/bin/java $JVMARGS -jar $OC4J_JAR $CMDARGS

Limitations

In a C/C++ application, file operations can be handled with the POSIX calls declared in fcntl.h (open) and unistd.h (write, lseek and others). File sizes and offsets are stored in variables of type off_t. On 32-bit systems off_t is by default a signed 32-bit integer, so its maximum value is 2^31 - 1, which limits the maximum file size to just under 2^31 bytes (2 GiB). On 64-bit systems such as x86-64, off_t is 64 bits wide, so the interface supports large files of up to 2^63 - 1 bytes.
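
To see which of the two cases applies to a given build, the width and maximum value of off_t can be queried directly. The following is only a minimal sketch; the helper names offsetBits, maxFileOffset and fitsInOffset are made up for this illustration and are not part of any library.

```cpp
#include <sys/types.h>   // off_t
#include <cassert>
#include <cstdint>
#include <limits>

// Width of off_t in bits for the current build (typically 32 or 64).
int offsetBits() {
    return static_cast<int>(sizeof(off_t) * 8);
}

// Largest offset off_t can hold: 2^31 - 1 for a 32-bit off_t,
// 2^63 - 1 for a 64-bit off_t.
std::int64_t maxFileOffset() {
    return static_cast<std::int64_t>(std::numeric_limits<off_t>::max());
}

// Hypothetical helper: can a file of sizeBytes be addressed with off_t?
bool fitsInOffset(std::int64_t sizeBytes) {
    assert(sizeBytes >= 0);  // sizes are never negative
    return sizeBytes <= maxFileOffset();
}
```

On a 32-bit build without large file support, fitsInOffset would reject the 4.8 GB movie from the overview, because its size is larger than 2^31 - 1 bytes.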

Prerequisites

Large File Support (LFS) is provided by the Linux kernel together with the GNU C library (glibc). It has been available since version 2.4.0 of the Linux kernel and glibc 2.2.3 (e.g. SuSE 7.2, Red Hat 7.1). The file system matters as well: ext2/ext3 have full support for LFS.

OS Configuration

The current configuration of the OS should also be checked. The resource limits of the shell and the processes it starts can be examined and changed with the shell command ulimit, e.g.:

$ulimit -a

...

file size (blocks, -f) unlimited

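The "file size" line shows the maximum size of a file that you can create. As the output indicates, the value is given in blocks rather than bytes (bash counts in 1024-byte blocks); it is probably a very large number or unlimited. If it is not, use the same command to change the limit to a sufficiently large number (e.g. 5000000 blocks) or to unlimited: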
$ulimit -S -f unlimited
After that you can create a large file for testing. This creates a file of around 5 GB:
$dd if=/dev/zero of=outputfile bs=1M count=5000
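
The limit that ulimit -f reports can also be read from inside a program with getrlimit and RLIMIT_FSIZE, which is handy for failing early with a clear message. Note that getrlimit reports the limit in bytes, not in blocks. This is a rough sketch; readFileSizeLimit and fitsUnderLimit are illustrative names, not existing APIs.

```cpp
#include <sys/resource.h>  // getrlimit, RLIMIT_FSIZE, RLIM_INFINITY
#include <cassert>
#include <cstdint>

// Reads the file size limit of the current process (soft and hard, in bytes).
// Returns false if getrlimit fails.
bool readFileSizeLimit(struct rlimit* out) {
    assert(out != nullptr);
    return getrlimit(RLIMIT_FSIZE, out) == 0;
}

// Hypothetical helper: would a file of wantedBytes stay under the soft limit?
bool fitsUnderLimit(const struct rlimit& lim, std::uint64_t wantedBytes) {
    if (lim.rlim_cur == RLIM_INFINITY) {
        return true;  // "unlimited" in the ulimit output
    }
    return wantedBytes <= static_cast<std::uint64_t>(lim.rlim_cur);
}
```

A program could call readFileSizeLimit at startup and refuse to begin encrypting a movie whose expected output size does not pass fitsUnderLimit.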


Resolutions
  • Compile your programs with "gcc -D_FILE_OFFSET_BITS=64", or "g++ -D_FILE_OFFSET_BITS=64" for C++ code. This forces all file access calls to use their 64-bit variants. Always use the correct data types: file sizes and offsets belong in off_t, not in e.g. int, which is only 32 bits wide. For portability to other platforms you should use "getconf LFS_CFLAGS", which returns -D_FILE_OFFSET_BITS=64 on Linux platforms but might return something else on e.g. Solaris. For linking, use the link flags reported by "getconf LFS_LDFLAGS". On Linux systems you do not need special link flags.
  • Define _LARGEFILE_SOURCE and _LARGEFILE64_SOURCE. With these defines you can use the LFS functions like open64 directly.
  • Use the O_LARGEFILE flag with open to operate on large files.
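
For the first option, a compile-time check can catch a build that accidentally dropped the flag. The sketch below defines _FILE_OFFSET_BITS in the source as a stand-in for the compiler flag (it must appear before any system header is included) and then verifies that off_t is really 64 bits wide. The function fileSizeBytes is a hypothetical helper added only for illustration.

```cpp
// Stand-in for "g++ -D_FILE_OFFSET_BITS=64"; must precede all system headers.
#ifndef _FILE_OFFSET_BITS
#define _FILE_OFFSET_BITS 64
#endif

#include <sys/stat.h>    // stat
#include <sys/types.h>   // off_t
#include <cassert>
#include <cstdint>

// Fails the build if large file support is not in effect.
static_assert(sizeof(off_t) >= 8,
              "off_t is not 64-bit: compile with -D_FILE_OFFSET_BITS=64");

// Returns the size of the file in bytes, or -1 if it cannot be examined.
// The size is kept in a 64-bit type all the way; storing it in an int
// would truncate anything above 2 GiB.
std::int64_t fileSizeBytes(const char* path) {
    assert(path != nullptr);
    struct stat st;
    if (stat(path, &st) != 0) {
        return -1;
    }
    return static_cast<std::int64_t>(st.st_size);
}
```

With the define (or the compiler flag) in place, stat is transparently mapped to its 64-bit variant, so the same source works for files larger than 2 GiB on 32-bit systems.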

I chose the first solution and changed the build script that compiles all components (and later links them into a shared library):

g++  -D_FILE_OFFSET_BITS=64 ...

No flag is needed during the linking phase. This approach does not require any code changes!


Implementation

A sample C/C++ implementation follows:
- Include the required headers:
#include <fcntl.h>     // open, O_CREAT, O_WRONLY
#include <sys/stat.h>  // S_IRWXU
#include <unistd.h>    // write
- Open the file named outputFileName for create/write:

int hOutputMovie = open(outputFileName, O_CREAT | O_WRONLY, S_IRWXU);
if (-1 == hOutputMovie)
{
    // open() failed; errno holds the reason
    loggerVpr.log("Problem opening file %s", outputFileName);
    ...
}

- Write a buffer of data (unsigned char *buffer) of the specified length (int len) to the file:
if (len != write(hOutputMovie, buffer, len))
{
    // write() returns -1 on error and may also write fewer bytes than requested
    loggerVpr.log("\nUnable to write data to file with fd %d", hOutputMovie);
    return false;
}
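
The check above treats a short write as a failure. Since write may legitimately write fewer bytes than requested, or be interrupted by a signal, a more forgiving variant retries until the whole buffer is written. This is a sketch under those assumptions, not the code used in the project; writeAll is a name chosen for this example.

```cpp
#include <fcntl.h>      // open flags (used by callers)
#include <sys/stat.h>   // permission bits (used by callers)
#include <sys/types.h>  // ssize_t
#include <unistd.h>     // write
#include <cassert>
#include <cerrno>
#include <cstddef>

// Writes exactly len bytes from buffer to fd.
// Retries after short writes and EINTR; returns false on any other error.
bool writeAll(int fd, const unsigned char* buffer, std::size_t len) {
    assert(buffer != nullptr || len == 0);
    std::size_t done = 0;
    while (done < len) {
        ssize_t n = write(fd, buffer + done, len - done);
        if (n < 0) {
            if (errno == EINTR) {
                continue;  // interrupted before writing anything: retry
            }
            return false;  // real error, e.g. EFBIG when a size limit is hit
        }
        if (n == 0) {
            return false;  // no progress; give up instead of spinning
        }
        done += static_cast<std::size_t>(n);
    }
    return true;
}
```

When the file size limit or the 2 GiB boundary is the problem, write typically fails with errno set to EFBIG, so logging errno in the failure branch helps to tell this case apart from a full disk.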

References

fcntl.h
Suse OS
AIX OS