Tuesday, February 19, 2008

Large File Support in Linux for C/C++ operations

Overview

My current project deals with high definition movies that are encrypted and used for IPTV. The problem occurred when I tried to encrypt a real world 4.8GB movie. The application server (OC4J) crashed with the error:
File size limit exceeded$JAVA_HOME/bin/java $JVMARGS -jar $OC4J_JAR $CMDARGS

Limitations

In a C/C++ application, file operations can be handled through the fcntl.h header (together with unistd.h for calls such as write). It provides operations for opening a file, writing to it and many more. File sizes and offsets are stored in variables of type off_t. On 32-bit systems off_t is by default a signed 32-bit integer, so its maximum value is 2^31 - 1, which limits the maximum file size to just under 2^31 bytes (2 GiB). On 64-bit systems like x86-64 this maximum value is much greater, and they support large files with sizes up to 2^63 - 1 bytes.
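
The arithmetic behind the 2 GiB limit can be sketched as follows. This is only an illustration; the names kMax32BitOffset and fitsIn32BitOffset are made up for this example and are not part of any library.

```cpp
#include <cstdint>
#include <limits>

// Largest file size (in bytes) that a signed 32-bit offset can represent:
// 2^31 - 1 = 2147483647.
constexpr std::int64_t kMax32BitOffset =
    std::numeric_limits<std::int32_t>::max();

// Returns true if a file of sizeInBytes bytes can be addressed with a
// signed 32-bit offset (hypothetical helper, for illustration only).
bool fitsIn32BitOffset(std::int64_t sizeInBytes)
{
    return sizeInBytes >= 0 && sizeInBytes <= kMax32BitOffset;
}
```

A movie of roughly 4.8GB (about 4800000000 bytes) is well above that limit, so fitsIn32BitOffset(4800000000LL) is false.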

Prerequisites

Large File Support (LFS) is provided by the Linux kernel together with the GNU C library (glibc), and has been available since version 2.4.0 of the Linux kernel and glibc 2.2.3 (e.g. SuSE 7.2, Red Hat 7.1). The file system is also important - ext2/ext3 have full support for LFS.

OS Configuration

The current configuration of the OS should also be checked. All resource and process limitations can be examined and changed using the Linux command ulimit e.g.:

$ulimit -a

...

file size (blocks, -f) unlimited

The "file size" property shows the maximum size of a file that you can create. Note that it is measured in blocks, not bytes (1024-byte blocks in bash). It is probably a very large number or unlimited. If it is not, use the same command to change it to a larger number (e.g. 5000000 blocks) or to unlimited:
$ulimit -S -f unlimited
After that you can create a large file for a test. This creates a file of around 5GB:
$dd if=/dev/zero of=outputfile bs=1M count=5000
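
The same limit can also be read from inside a C/C++ program with getrlimit and RLIMIT_FSIZE, which report the value in bytes. The following is a minimal sketch; the function name printFileSizeLimit is invented for this example.

```cpp
#include <sys/resource.h>
#include <cstdio>

// Prints the soft file size limit of the current process.
// Returns false if the limit could not be read.
bool printFileSizeLimit()
{
    struct rlimit limit;
    if (getrlimit(RLIMIT_FSIZE, &limit) != 0)
    {
        std::perror("getrlimit");
        return false;
    }
    if (limit.rlim_cur == RLIM_INFINITY)
    {
        std::printf("file size limit: unlimited\n");
    }
    else
    {
        // RLIMIT_FSIZE is expressed in bytes, unlike the blocks of ulimit -f
        std::printf("file size limit: %llu bytes\n",
                    static_cast<unsigned long long>(limit.rlim_cur));
    }
    return true;
}
```

A process that tries to grow a file beyond this limit receives the SIGXFSZ signal, which is what produces the "File size limit exceeded" message.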


Resolutions
  • Compile your programs with "gcc -D_FILE_OFFSET_BITS=64", or "g++ -D_FILE_OFFSET_BITS=64" for C++ code. This forces all file access calls to use the 64 bit variants. It's important to always use the correct data types, i.e. to use off_t for file sizes and offsets and not e.g. int, which is only 32 bit. For portability with other platforms you should use getconf LFS_CFLAGS, which will return -D_FILE_OFFSET_BITS=64 on Linux platforms but might return something else on e.g. Solaris. For linking, you should use the link flags that are reported via getconf LFS_LDFLAGS. On Linux systems, you do not need special link flags.
  • Define _LARGEFILE_SOURCE and _LARGEFILE64_SOURCE. With these defines you can use the LFS functions like open64 directly.
  • Use the O_LARGEFILE flag with open to operate on large files.

I chose the first solution and changed the build script that compiles all components (and later links them into a shared binary library):

g++  -D_FILE_OFFSET_BITS=64 ...

No flag is needed during the linkage phase. This approach does not require code changes!
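
One way to check that the flag really took effect is to look at the width of off_t in the compiled code. This is only a sketch, and offsetBits is a name chosen for this example.

```cpp
#include <sys/types.h>
#include <climits>

// Number of bits in off_t under the current compilation settings
// (hypothetical helper, for illustration only).
int offsetBits()
{
    return static_cast<int>(sizeof(off_t) * CHAR_BIT);
}
```

On a 32-bit Linux system this should report 32 when compiled without the flag and 64 when compiled with g++ -D_FILE_OFFSET_BITS=64. On x86-64 it is 64 either way.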


Implementation

A sample C/C++ implementation follows:
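- Import required headers (fcntl.h for open, sys/stat.h for the S_IRWXU mode, unistd.h for write):
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>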
- Open file with name outputFileName for create/write:

int hOutputMovie = open(outputFileName, O_CREAT | O_WRONLY, S_IRWXU);
if (-1 == hOutputMovie)
{
loggerVpr.log("Problem opening file %s", outputFileName);
...
}

- Write to the file some buffer of data (unsigned char *buffer) with specified length (int len):
if (len != write(hOutputMovie, buffer, len))
{
loggerVpr.log("\nUnable to write data to file with fd %d", hOutputMovie);
return false;
}
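
Note that write is allowed to write fewer bytes than requested, and with multi-gigabyte files this becomes more likely to show up. The check above treats such a short write as an error. A more tolerant variant is sketched below; writeAll is a hypothetical helper, not part of the original code.

```cpp
#include <cerrno>
#include <cstddef>
#include <unistd.h>

// Writes exactly len bytes from buffer to fd, retrying after short writes
// and interrupted calls. Returns true only if everything was written.
bool writeAll(int fd, const unsigned char *buffer, size_t len)
{
    size_t written = 0;
    while (written < len)
    {
        ssize_t n = write(fd, buffer + written, len - written);
        if (n < 0)
        {
            if (errno == EINTR)
            {
                continue;   // interrupted before writing anything: retry
            }
            return false;   // real error, errno describes it
        }
        if (n == 0)
        {
            return false;   // no progress, give up instead of looping
        }
        written += static_cast<size_t>(n);
    }
    return true;
}
```

It could replace the direct call, e.g. if (!writeAll(hOutputMovie, buffer, len)) { ... }.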

Wednesday, January 16, 2008

Functions with a Variable Argument List II

After using functions with a variable argument list to implement a simple logging solution, I found out that those functions also do a great job in exception handling.
When you want to throw an exception you usually implement a utility method like this:

void throwByName(JNIEnv *env, const char *name, const char *msg)
{
env->ExceptionDescribe();
env->ExceptionClear();

jclass cls = env->FindClass(name);
/* if cls is NULL, an exception has already been thrown */
if (cls != NULL)
{
env->ThrowNew(cls, msg);
}
/* free the local ref */
env->DeleteLocalRef(cls);
}


This method can be invoked like this:
char buffer[128];
/* fill buffer with the error message before throwing */
throwByName(env, "com/exceptions/MyException", buffer);

The problem is that if you have a lot of methods and exception conditions, you have to copy/paste the exception class name every time, which is inconvenient. Also, after the first refactoring you will probably have to change the exception class in many places.
To avoid those problems you can implement utility methods for all the exceptions that are used (or maybe only the most frequently used ones), each of which stores the exception class:
void throwMyException(JNIEnv * env, const char * errorMessage, ...)
{
char buffer[STACK_TRACE_SIZE];
va_list args;
va_start(args, errorMessage);
vsnprintf(buffer, sizeof(buffer), errorMessage, args);
va_end(args);

throwByName(env, "com/exceptions/MyException", buffer);
}

The benefit is that you can throw that particular exception very easily and without much copy/paste:
throwMyException(env, "Error finishing encryption!");
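
A fixed-size buffer silently truncates messages that are longer than the buffer. If that matters, the message can be formatted into a string of exactly the needed size by calling vsnprintf twice. This is only a sketch under that assumption; formatMessage is a made-up helper name, and va_copy requires C++11.

```cpp
#include <cstdarg>
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Formats a printf-style message into a std::string of exactly the
// needed size (hypothetical helper, for illustration only).
std::string formatMessage(const char *format, ...)
{
    va_list args;
    va_start(args, format);

    // vsnprintf consumes the argument list, so measure with a copy first.
    va_list argsCopy;
    va_copy(argsCopy, args);
    int needed = std::vsnprintf(nullptr, 0, format, argsCopy);
    va_end(argsCopy);

    std::string result;
    if (needed > 0)
    {
        // One extra byte for the terminating '\0' written by vsnprintf.
        std::vector<char> buffer(static_cast<std::size_t>(needed) + 1);
        std::vsnprintf(buffer.data(), buffer.size(), format, args);
        result.assign(buffer.data(), static_cast<std::size_t>(needed));
    }
    va_end(args);
    return result;
}
```

The result could then be handed to throwByName with formatMessage(...).c_str().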

You can also take a look at this post for more tips & tricks for JNI.

Functions with a Variable Argument List

My current project is a JNI application that invokes C++ code from Java. I began implementing a Logger component for the C++ part. As usual, I wanted it to be very easy to integrate, by just replacing the previous printing to stdout (printf) with a utility method (log), e.g.:
replace: printf("Error = %d (%s)", error, errorMessage);
with : log("Error = %d (%s)", error, errorMessage);
The signature of the log method is log(const char * message, ...), and it should print the log message to a file with fprintf(file, messageString).

The big problem turned out to be how to pass all the arguments from log to fprintf, since both are functions with a variable argument list.

Some ideas:
  1. All the tutorials teach you how to iterate over the variable arguments and handle them one by one - time consuming.
  2. Overload operator << - a lot of refactoring required.
  3. A cool idea was to redirect System.out from Java to a file with System.setOut(printStream) - but stdout is not the same as System.out.
A possible solution is to redirect stdout from the C++ code to a file and skip the whole Logger component - printf will then append to the log file:
std::freopen(LOG_FILE, "a", stdout);
The big side effect is that the Java System.out stream is also redirected and writes to the file.

And the winner is:

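#include <stdio.h>
#include <stdarg.h>
void log(const char * message, ...) {
char buffer[512];
va_list args;
va_start(args, message);
// Formats the message into the buffer (truncated if it does not fit);
// returns the size the complete message would have
vsnprintf(buffer, sizeof(buffer), message, args);
va_end(args);
// Print the buffer as data, never as a format string
fprintf(file, "%s", buffer);
}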

vsnprintf formats the message with the argument list and writes it to the buffer that is easily logged.
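
If the intermediate buffer is not needed, the argument list can also be forwarded directly with vfprintf, the va_list counterpart of fprintf. The following is a small sketch; logTo is a name invented for this example, and it takes the target file as a parameter instead of using the file variable from the code above.

```cpp
#include <cstdarg>
#include <cstdio>

// Writes a printf-style message to file, forwarding the variable
// arguments straight to vfprintf. Returns the number of characters
// written, or a negative value on error.
int logTo(std::FILE *file, const char *message, ...)
{
    va_list args;
    va_start(args, message);
    int written = std::vfprintf(file, message, args);
    va_end(args);
    return written;
}
```

It is called just like printf with an extra first argument, e.g. logTo(file, "Error = %d (%s)", error, errorMessage);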

Clean solution but far from my Java-stuffed brain.

A solution for exception handling in JNI is in: Functions with a Variable Argument List II
You can also take a look at this post for more tips & tricks for JNI.