tirsdag 7. august 2012

Performance of memory mapped files for streams

Memory mapped files, or mmap-files map the file into virtual memory, allowing to address the files data as if it is in memory, without the overhead of actually copying everything into memory. Mmap-files are part of POSIX and therefore available on practically all platforms. I investigated the performance of mmap-files for numerical weather prediction files, i.e. files larger 1GB, where large chunks of data must be read as a stream.

The library to read the data uses many seekg and read on C++ iostreams. By switching the C++ iostream to an boost::iostream::device::mapped_file_source the program could be used without larger changes:

        using namespace boost::iostreams;
        typedef stream mmStream;
        mmStream* mmistream = new mmStream();
        mmistream->open(mapped_file_source(fileName));
        feltFile_ = mmistream;

This worked as expected, and the feltFile_ was working with the old data-reading commands:
    feltFile_->seekg(pos, ios_base::beg);
    feltFile_->read((char*) ret.get(), blockWords * sizeof(word));
On my 32bit platform, I ran into the first problem with files larger 2GB. mmap files need a address-space which is larger than the file-size, that is on linux 32bit ~3.5GB. For files between 2GB and 3.5GB, there is a problem with the file-size pointer of iostream, which seems to use a signed int.

64bit platforms are around the corner, so I continued my test with 1.5GB files, ignoring the 32bit issues. For compatibility, my code just catches the mmap-exception and reopens the large files as standard-streams.

Performance measurements showed no benefit on using mmap-files or standard streams, even when reading the same file in parallel. There was even a slight, insignificant better performance for std::streams. Mmap files don't seem to make sense for stream-like readers, even if it is very simple to switch from the one to the other. I guess, mmap makes more sense where the file otherwise would be slurped into memory. I removed the use of mapped_files after the test, but just to simplyfy the code.

fredag 11. mai 2012

Accessing a Fortran-library from C

I recently had to work with a scientist who had written a nice library in his preferred language, Fortran. The library was useful, we wanted to publish the results on our webservers and had thus to connect it to apache. Unfortunately, apache isn't written in Fortran, so we had to create a C-wrapper around the Fortran code.

This starts quite simple. A subroutine like
subroutine calcThis(in, out)
real, intent(IN) :: in
real, intent(OUT):: out
can be called from C like
callthis_(&in, &out);
Note here, that fortran-functions appear in lower-case and with usually one underscore in C, and all, even input variables are given by reference.

The problem starts when asking the question: What is this real, is it float or double? The sad answer is any, because the type is defined by the compiler, or rather, the compiler options, e.g. for gcc, -fdefault-real-8 means to use real*8 internally, which can be 8bit or 16bit, depending on architecture and controlled by -fdefault-double-8. Since Fortran2003, this can be much better controlled using the ISO_C_BINDING, which allow to use explicitly C-datatypes like real, KIND(C_DOUBLE)
This was originally thought to help fortran access C-libraries, but it also helps the other way around. Unfortunately, at least in gcc-4.4, it does not check if the datatypes are used inconsistently, so it is possible to intermix a 4bit real with an 8bit real and so on.

Strings are even more tricky. While the C-String is only a character array with a final 0 (char*), a fortran string is a object, with internally defined length. While the calling convention with intermixed C is not defined, it is often (gfortran, pgf), used the following way:
fortranfunc_(char* string1, char* string2, int strLen1, int strLen2)
i.e., the length of the strings are the last arguments of the function. This can get very complicated, e.g. when variable length-arrays of strings are used, e.g.
integer, intend(IN) :: aryLen
character*256, intend(IN) :: strAry(aryLen)
Also here, the ISO_C_BINDING can help:
integer, intend(IN) :: aryLen
character, kind(LEN=1, C_CHAR) :: strAry(aryLen*256)

The length is always 1, since it sends only a pointer. The allocated length is then the length of a single string times the array-length.
Unfortunately gfortran (and I think other compilers, too), don't convert between the string-types automatically, and copying a C-string to a fortran string needs special care. On the net, I have found some Fortran functions, which should do so, but I didn't manage to get them compiled and I had to change them to Fortran subroutines to work:
SUBROUTINE C_F_STRING (C_STR, F_STRING)
USE, INTRINSIC :: ISO_C_BINDING, ONLY: C_CHAR, C_NULL_CHAR
IMPLICIT NONE
CHARACTER(LEN=1,KIND=C_CHAR), intent(IN) :: C_STR
CHARACTER(LEN=*), intent(INOUT)          :: F_STRING ! output
INTEGER                      :: I, N
LOGICAL                      :: SETVAL
   
N = LEN(F_STRING)
SETVAL = .true.
DO I = 1, N
   IF (C_STR(I:I) .EQ. C_NULL_CHAR) THEN
       SETVAL = .false.
       F_STRING = F_STRING(:I-1)
   END IF
   IF (SETVAL) THEN
       F_STRING(I:I) = C_STR(I:I)
   ELSE
       GOTO 100
   END IF
END DO
100   CONTINUE

END subroutine C_F_STRING


SUBROUTINE F_C_STRING (F_STRING, C_STRING)
USE, INTRINSIC :: ISO_C_BINDING, ONLY: C_CHAR, C_NULL_CHAR
IMPLICIT NONE
CHARACTER(LEN=*), INTENT(IN) :: F_STRING
CHARACTER(LEN=1,KIND=C_CHAR), INTENT(INOUT) :: C_STRING(LEN(F_STRING)+1)
INTEGER                      :: N, I

N = LEN_TRIM(F_STRING)
DO I = 1, N
   C_STRING(I) = F_STRING(I:I)
END DO
C_STRING(N + 1) = C_NULL_CHAR

END SUBROUTINE F_C_STRING

It should be noted that the C_String must have allocated 1 byte more than the fortran strings, due to the additional ending 0. The subroutines are then used like:
! initialization
CHARACTER(LEN=1, KIND=C_CHAR), INTENT(INOUT) :: rep250(251*maxrep)
CHARACTER(LEN=1, KIND=C_CHAR), INTENT(IN) :: desc100(101)
CHARACTER(LEN=250) :: f_rep250(maxrep)
CHARACTER(LEN=100) :: f_desc100

! convert to fortran strings
CALL C_F_STRING(desc100,f_desc100)
! the array case
DO i = 1, maxrep
   CALL C_F_STRING(rep250((i-1)*251+1) ,f_rep250(i))
END DO

! do something with the fortran strings
...

! convert to C-strings
CALL F_C_STRING(f_desc100, desc100)
DO i = 1, maxrep
   CALL F_C_STRING(f_rep250(i), rep250((i-1)*251+1))
END DO
Since this might require a lot of additional wrapper-code to be written, it should be automatized. In my cases, I wrote a pseudo header including tags like
FWRAP(real workWithStrings)[
                      IN integer maxrep,
                      INOUT character*250(maxrep) rep250,
                      IN character*100 desc100]
which got processed by a perl-skript, writing both the C header-file, and the f90 wrapper-functions. Good information about the ISO_C_BINDING can be found in the IBM Linux Compilers handbook.