cmn_UTF8Converter Class Reference
[Common, basic classes, functions and types]

Converts from a specified code page to UTF-8 and back.

#include <cmn_utf8.h>

Public Member Functions

 cmn_UTF8Converter (const string &a_codeSet)
virtual ~cmn_UTF8Converter ()
string ConvertToUTF8 (const string &a_source)
string ConvertFromUTF8 (const string &a_utf8)

Static Public Member Functions

static const char * InitCodeSet ()

Private Attributes

 log_CLASSID_m
string m_codeset
cmn_Mutex m_converterToUTF8_x
cmn_Mutex m_converterFromUTF8_x
iconv_t m_fromUTFiconvh
iconv_t m_toUTFiconvh

Detailed Description

Converts from a specified code page to UTF-8 and back.

Uses cmn_UTF8 as input/output.

Definition at line 55 of file cmn_utf8.h.
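
The round trip this class performs can be sketched directly against POSIX iconv, without the class itself. This is a minimal illustration under stated assumptions: the helper name convert_once and the four-bytes-per-input-byte output bound are hypothetical and do not come from cmn_utf8.h.

```cpp
#include <iconv.h>

#include <cassert>
#include <cerrno>
#include <string>
#include <system_error>

// Hypothetical helper (not part of cmn_utf8.h): one-shot conversion of `in`
// from code set `from` to code set `to` using POSIX iconv.
std::string convert_once(const char* to, const char* from,
                         const std::string& in) {
    assert(to != nullptr && from != nullptr);

    iconv_t cd = iconv_open(to, from);
    if (cd == iconv_t(-1)) {
        throw std::system_error(errno, std::generic_category(), "iconv_open");
    }

    // Assumed bound for this sketch: at most 4 output bytes per input byte.
    std::string out(in.size() * 4 + 4, '\0');
    char* src = const_cast<char*>(in.data());
    char* dst = &out[0];
    size_t inLeft = in.size();
    size_t outLeft = out.size();

    size_t rc = iconv(cd, &src, &inLeft, &dst, &outLeft);
    int err = errno;  // save before iconv_close() can overwrite it
    iconv_close(cd);

    if (rc == size_t(-1)) {
        throw std::system_error(err, std::generic_category(), "iconv");
    }

    // Keep only the bytes iconv actually produced.
    out.resize(out.size() - outLeft);
    return out;
}
```

For example, convert_once("UTF-8", "ISO-8859-1", "caf\xE9") yields the bytes 63 61 66 C3 A9, and converting that result back with the arguments swapped restores the original four bytes.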


Constructor & Destructor Documentation

cmn_UTF8Converter::cmn_UTF8Converter ( const string &  a_codeSet  ) 

Definition at line 86 of file cmn_utf8.cpp.

References errno, log_ERR_m, log_FUNC_m, m_codeset, m_fromUTFiconvh, m_toUTFiconvh, and UTF8_c().

  : m_codeset(a_codeSet) {

  if (a_codeSet.empty()) {
      log_FUNC_m(cmn_UTF8Converter);
      log_ERR_m(
          "Codeset wasn't detected properly. Please check system settings.");
  }

#if IVD_POSIX_OS
    // Initialization of conversion

    if (m_codeset == UTF8_c) {
        return;
    }

    m_toUTFiconvh = iconv_open(UTF8_c, a_codeSet.c_str());

    if (m_toUTFiconvh == iconv_t(-1)) {
        throw ivd_SysError(
            errno, string("iconv_open(): source code set: ") + a_codeSet );
    }

    m_fromUTFiconvh = iconv_open(a_codeSet.c_str(), UTF8_c);

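    if (m_fromUTFiconvh == iconv_t(-1)) {
        // Save errno, then release the first handle: the destructor
        // does not run when the constructor throws.
        int openError = errno;
        iconv_close(m_toUTFiconvh);
        throw ivd_SysError(
            openError, string("iconv_open(): target code set: ") + a_codeSet );
    }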
#endif
}

cmn_UTF8Converter::~cmn_UTF8Converter (  )  [virtual]

Definition at line 118 of file cmn_utf8.cpp.

References m_codeset, m_fromUTFiconvh, m_toUTFiconvh, and UTF8_c().

{
#if IVD_POSIX_OS
   if (m_codeset == UTF8_c) {
       return;
   }
   iconv_close(m_fromUTFiconvh);
   iconv_close(m_toUTFiconvh);
#endif
}


Member Function Documentation

string cmn_UTF8Converter::ConvertFromUTF8 ( const string &  a_utf8  ) 

Definition at line 225 of file cmn_utf8.cpp.

References cmn_HexDump(), errno, ie_FATAL_ERROR, ie_NYI, log_FUNC_m, log_WRN_m, m_codeset, m_converterFromUTF8_x, m_fromUTFiconvh, NULL, and UTF8_c().

Referenced by cmn_UTF8ToLocale().

{

#if IVD_POSIX_OS

    if (m_codeset == UTF8_c || a_utf8.empty()) {
        return a_utf8;
    }

    string dest((a_utf8.length()+1), '\0');
    char* src_p = const_cast<char*>(a_utf8.c_str());
    char* dest_p = const_cast<char*>(dest.data());
    size_t inbytesleft = a_utf8.length();
    size_t outbytesleft = dest.length();

    size_t conversions;
    {  // iconv must be under mutex
        cmn_MutexLock l(m_converterFromUTF8_x);
        conversions = iconv(m_fromUTFiconvh,
            &src_p, &inbytesleft,
            &dest_p, &outbytesleft);

        if (conversions == size_t(-1)) {
            // Save errno first: the logging macro below may overwrite it.
            int convError = errno;
            log_FUNC_m(ConvertFromUTF8);
            // Resets conversion handle to the initial state so
            // that next conversion won't have problems.
            iconv(m_fromUTFiconvh, NULL, NULL, NULL, NULL);
            throw ivd_SysError(
                convError, string("iconv(...): \'") + a_utf8 + "\'");
        }
    }

    if (inbytesleft > 0) {
        log_FUNC_m(ConvertFromUTF8);
        if (outbytesleft == 0) {
            throw ivd_InternalError(ie_NYI, "dest is too small.");
        }
        else {
            log_WRN_m(
                "Incorrect input buffer when converting from UTF-8: " <<
                a_utf8
            );
        }
    }

    //
    // It is assumed that the data format contains zero-terminated strings.
    // Otherwise something is wrong.
    //
    UInt32_t convsize = dest.length() - outbytesleft;

    // Works properly for zero-terminated strings and prevents
    // accessing invalid memory for others.
    if (dest.length() >= outbytesleft) {
        dest.resize(convsize);
    }
    else {
        log_FUNC_m(ConvertFromUTF8);
        ostringstream sstr;
        sstr <<"dest.length() is less than outbytesleft.  convsize = " << convsize
                        << ", dest.length() = " << dest.length()
                        << " outbytesleft = " << outbytesleft
                        << " source = " << cmn_HexDump(
                            a_utf8.c_str(), a_utf8.length(), 16, true)
                        << " dest str = " << dest;
        throw ivd_InternalError(ie_FATAL_ERROR, sstr.str());
    }

    return dest;

#else

    // Windows: Dummy conversion; just create a string from the buffer.

    return a_utf8;
#endif

}
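
The "dest is too small" case above is reported as not yet implemented. One common way to handle it is to grow the output buffer and call iconv again whenever it fails with E2BIG. The following is only a sketch of that idea: the helper name convert_growing and the doubling policy are assumptions, not taken from cmn_utf8.cpp.

```cpp
#include <iconv.h>

#include <cassert>
#include <cerrno>
#include <string>
#include <system_error>

// Hypothetical helper: converts `in` with an already opened handle `cd`,
// doubling the output buffer each time iconv reports E2BIG.
std::string convert_growing(iconv_t cd, const std::string& in,
                            size_t initial = 16) {
    assert(initial > 0);

    std::string out(initial, '\0');
    char* src = const_cast<char*>(in.data());
    size_t inLeft = in.size();
    size_t used = 0;  // bytes of `out` already filled

    for (;;) {
        char* dst = &out[0] + used;
        size_t outLeft = out.size() - used;

        size_t rc = iconv(cd, &src, &inLeft, &dst, &outLeft);
        used = out.size() - outLeft;

        if (rc != size_t(-1)) {
            break;  // all input consumed
        }
        if (errno != E2BIG) {
            int err = errno;
            // Return the handle to its initial state before reporting.
            iconv(cd, nullptr, nullptr, nullptr, nullptr);
            throw std::system_error(err, std::generic_category(), "iconv");
        }
        // iconv already advanced src/inLeft past the converted input,
        // so the next call simply continues where this one stopped.
        out.resize(out.size() * 2);
    }

    out.resize(used);
    return out;
}
```

The sketch relies on iconv updating the input pointer and remaining-byte counts on a partial conversion, so no input is converted twice. Stateful target encodings would additionally need a final flush call with a null input pointer, which is omitted here.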

string cmn_UTF8Converter::ConvertToUTF8 ( const string &  a_source  ) 

Definition at line 152 of file cmn_utf8.cpp.

References cmn_HexDump(), cmn_MAX_UTF8_CHAR_SIZE_c, errno, ie_FATAL_ERROR, ie_NYI, log_FUNC_m, log_WRN_m, m_codeset, m_converterToUTF8_x, m_toUTFiconvh, NULL, and UTF8_c().

Referenced by cmn_LocaleToUTF8().

{

#if IVD_POSIX_OS

    if (m_codeset == UTF8_c) {
        return a_source;
    }

    string dest((a_source.length()+1)*cmn_MAX_UTF8_CHAR_SIZE_c, '\0');

    char *src_p = const_cast<char*>(a_source.c_str());
    char *dest_p = const_cast<char*>(dest.data());
    size_t inbytesleft = a_source.length();
    size_t outbytesleft = dest.length();

    size_t conversions;
    {  // iconv must be under mutex
        cmn_MutexLock l(m_converterToUTF8_x);
        conversions = iconv(m_toUTFiconvh,
            &src_p, &inbytesleft,
            &dest_p, &outbytesleft
        );

        if (conversions == size_t(-1)) {
            // Save errno first: the logging macro below may overwrite it.
            int convError = errno;
            log_FUNC_m(ConvertToUTF8);
            // Resets conversion handle to the initial state so
            // that next conversion won't have problems.
            iconv(m_toUTFiconvh, NULL, NULL, NULL, NULL);
            throw ivd_SysError(
                convError, string("iconv(...): \'") + a_source + "\'");
        }
    }
    if (inbytesleft > 0) {
        log_FUNC_m(ConvertToUTF8);
        if (outbytesleft == 0) {
            // Not yet implemented: expand the buffer and retry.
            throw ivd_InternalError(ie_NYI, "dest is too small.");
        }
        else {
            log_WRN_m(
                "Incorrect input buffer when converting to UTF-8: " <<
                a_source
            );
        }
    }

    UInt32_t tgtSize = dest.length() - outbytesleft;
    if (dest.length() >= outbytesleft) {
        dest.resize(tgtSize);
    }
    else {
        log_FUNC_m(ConvertToUTF8);
        ostringstream sstr;
        sstr <<"dest.length() is less than outbytesleft.  tgtSize = " << tgtSize
                        << ", dest.length() = " << dest.length()
                        << " outbytesleft = " << outbytesleft
                        << " source = " << cmn_HexDump(
                            a_source.c_str(), a_source.length(), 16, true)
                        << " dest str = " << dest;
        throw ivd_InternalError(ie_FATAL_ERROR, sstr.str());
    }
    return dest;

#else

    // Windows: Dummy conversion (just copy data).

    return a_source;

#endif
}

const char * cmn_UTF8Converter::InitCodeSet (  )  [static]

Definition at line 128 of file cmn_utf8.cpp.

References NULL.

Referenced by cmn_SysInfo::GetSystemData().

{

#if IVD_POSIX_OS
    const char* locale = setlocale(LC_CTYPE, "");

    if (locale == NULL) {
        return "";
    }

    const char* codeset = nl_langinfo(CODESET);

    if (codeset == NULL) {
        return "";
    }

    return codeset;

#elif TGT_OS_windows
    return "UTF-16";
#else
    #error "Unknown/unsupported platform."
#endif
}
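
The POSIX branch above can be tried in isolation with the short sketch below. The function name detect_codeset is hypothetical. Unlike InitCodeSet(), it returns a std::string copy, because the pointer returned by nl_langinfo() may be invalidated by a later call to nl_langinfo() or setlocale().

```cpp
#include <clocale>
#include <langinfo.h>

#include <cassert>
#include <string>

// Hypothetical helper mirroring the POSIX branch of InitCodeSet():
// adopt the locale from the environment, then ask for its code set name.
std::string detect_codeset() {
    if (std::setlocale(LC_CTYPE, "") == nullptr) {
        // The environment names a locale that is not available.
        return std::string();
    }
    const char* codeset = nl_langinfo(CODESET);
    // Copy immediately; the storage belongs to the C library.
    return codeset != nullptr ? std::string(codeset) : std::string();
}
```

An empty result corresponds to the case that the constructor logs as "Codeset wasn't detected properly".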


Member Data Documentation

cmn_UTF8Converter::log_CLASSID_m [private]

Definition at line 66 of file cmn_utf8.h.

string cmn_UTF8Converter::m_codeset [private]

Definition at line 69 of file cmn_utf8.h.

Referenced by cmn_UTF8Converter(), ConvertFromUTF8(), ConvertToUTF8(), and ~cmn_UTF8Converter().

cmn_Mutex cmn_UTF8Converter::m_converterFromUTF8_x [private]

Definition at line 73 of file cmn_utf8.h.

Referenced by ConvertFromUTF8().

cmn_Mutex cmn_UTF8Converter::m_converterToUTF8_x [private]

Definition at line 72 of file cmn_utf8.h.

Referenced by ConvertToUTF8().

iconv_t cmn_UTF8Converter::m_fromUTFiconvh [private]

Definition at line 74 of file cmn_utf8.h.

Referenced by cmn_UTF8Converter(), ConvertFromUTF8(), and ~cmn_UTF8Converter().

iconv_t cmn_UTF8Converter::m_toUTFiconvh [private]

Definition at line 75 of file cmn_utf8.h.

Referenced by cmn_UTF8Converter(), ConvertToUTF8(), and ~cmn_UTF8Converter().


The documentation for this class was generated from the following files: