Public Member Functions | Public Attributes | Private Attributes

cmn_FastPattSearch Class Reference
[G_new_group]

This is a class for fast pattern search through buffer. More...

#include <cmn_FastSearch.h>

Inheritance diagram for cmn_FastPattSearch:
Inheritance graph
[legend]
Collaboration diagram for cmn_FastPattSearch:
Collaboration graph
[legend]

List of all members.

Public Member Functions

 cmn_FastPattSearch (cmn_SkipTbl4FS *a_skipTbl_p)
 cmn_FastPattSearch (const char *const a_buffer_p, const int a_bufLen, cmn_SkipTbl4FS *a_skipTbl_p)
virtual ~cmn_FastPattSearch ()
void SetPattern (const string &a_pattern)
 set new search pattern
Int32_t Find ()
 NOTE! if return value is grater than buffer then string was not found the return value is willing handled in this way, because of using same algorithem with merging buffers.

Public Attributes

 log_CLASSID_m
 Macro to add class name member s_className.

Private Attributes

cmn_SkipTbl4FSm_skipTbl_p
 bad-character skip table instance, it is possible that is preconfigured in case of searching already known string

Detailed Description

This is a class for fast pattern search through buffer.

Base on ASCII table. Best for searc long pattern with no repeated characters in it.

Originaly written by

Author:
Boyer-Moore and optimized as Horspool algorithm
See also:
http://www-igm.univ-mlv.fr/~lecroq/string/node14.html#SECTION00140
http://www-igm.univ-mlv.fr/~lecroq/string/node18.html#SECTION00180

Definition at line 145 of file cmn_FastSearch.h.


Constructor & Destructor Documentation

cmn_FastPattSearch::cmn_FastPattSearch ( cmn_SkipTbl4FS a_skipTbl_p  ) 

Definition at line 75 of file cmn_FastSearch.cpp.

References log_FUNC_m.

            :
            cmn_FastSearch(NULL, 0),
            m_skipTbl_p(a_skipTbl_p)
{
    log_FUNC_m(cmn_FastPattSearch);
    // Empty
} 

cmn_FastPattSearch::cmn_FastPattSearch ( const char *const   a_buffer_p,
const int  a_bufLen,
cmn_SkipTbl4FS a_skipTbl_p 
)

Definition at line 87 of file cmn_FastSearch.cpp.

References log_FUNC_m.

            :
            cmn_FastSearch(a_buffer_p, a_bufLen),
            m_skipTbl_p(a_skipTbl_p)
{
    log_FUNC_m(cmn_FastPattSearch);
    //Empty
}

cmn_FastPattSearch::~cmn_FastPattSearch (  )  [virtual]

Definition at line 101 of file cmn_FastSearch.cpp.

References log_FUNC_m.

                                        {
    log_FUNC_m(~cmn_FastPattSearch);

/*DEL    if (m_skipTblConstructedInternaly) {
//DEL        delete [] m_bmBc_p;
        delete  m_skipTbl_p;
    }
*/
}


Member Function Documentation

Int32_t cmn_FastPattSearch::Find (  )  [virtual]

NOTE! if return value is grater than buffer then string was not found the return value is willing handled in this way, because of using same algorithem with merging buffers.

It's known how much buffer is need to be merged at front of next buffer.

Implements cmn_FastSearch.

Definition at line 137 of file cmn_FastSearch.cpp.

References cmn_SkipTbl4FS::GetPattern(), cmn_SkipTbl4FS::GetPatternLen(), cmn_SkipTbl4FS::GettSkipTbl(), ie_INVALID_ARG, log_FUNC_m, cmn_FastSearch::m_buffer_p, cmn_FastSearch::m_bufLen, m_skipTbl_p, and NULL.

                                 {
    log_FUNC_m(Find);

    if (m_buffer_p == NULL) {
        throw ivd_InternalError(ie_INVALID_ARG, "m_buffer_p == NULL. Call SetBuffer() first.");
    }

    UInt32_t    patLen   = m_skipTbl_p->GetPatternLen();
    UInt32_t    patLen_1 = patLen - 1;

    register const char *p    = m_buffer_p + patLen_1;
//DEL    register int         skip = 0;

    register const UInt16_t *b    = m_skipTbl_p->GettSkipTbl();
    const    char           *patt = m_skipTbl_p->GetPattern();

    register char    lastChar = patt[patLen_1];

    const char          *buff = m_buffer_p;

/* ut_cmn test "102", "Performance test, predefined skip table" on NT 
   1.372 sec    0.730 on Linux       Unit test code is  cmn_FastPattSrch_102
*/
    const char *limit = buff + m_bufLen;
    while (p < limit) {   // this is optimized not readable code  it si 14% faster
                           // see below in comment for original code 
       if (  (*p == lastChar) 
          && (memcmp(patt, p - patLen_1, patLen_1) == 0)) {
          return static_cast<Int32_t>(p - buff - patLen_1);
       }
       p += *(b + (UInt8_t)(*p));
    }
    return static_cast<Int32_t>( (p - buff - m_bufLen) - patLen );  // NOTE! if return value is negative then string was not found

/* DO NOT USE ut_cmn test "102", "Performance test, predefined skip table" on NT
   1.502 sec  1.312 with general optimizaton 
    const char *limit = buff + m_bufLen;
    while (p < limit) {   // this is optimized not readable code  it si 14% faster
                           // see below in comment for original code 
//>>> CORE DUMP
        skip = *(b + (UInt8_t)(*(p += skip)));
//    access to char out of buffer
       if (  (*p == lastChar) 
          && (memcmp(patt, p - patLen_1, patLen_1) == 0)) {
          return p - buff - patLen_1;
       }
    }
    return (p - buff - m_bufLen) - patLen;  // NOTE! if return value is negative then string was not found
*/

/* 2.082 sec
    char c   = *p;
    const char *limit = buff + m_bufLen;
    while (p < limit) {   // this is optimized not readable code  it si 14% faster
                           // see below in comment for original code 
       skip = *(b + (UInt8_t)(c = *(p += skip)));
       if (  (c == lastChar) 
          && (memcmp(patt, p - patLen_1, patLen_1) == 0)) {
          return p - buff - patLen_1;
       }
    }
    return (p - buff - m_bufLen) - patLen;  // NOTE! if return value is negative then string was not found
*/
/* 2.173 sec 
    char c   = *p;
    const char *limit = buff + m_bufLen;
    while (p < limit) {   // this is optimized not readable code  it si 14% faster
                           // see below in comment for original code 
       if (  (lastChar == (c = *(p += skip) ) ) 
          && (memcmp(patt, p - patLen_1, patLen_1) == 0)) {
          return p - buff - patLen_1;
       }
       skip = *(b + (UInt8_t)c );
    }
    return (p - buff - m_bufLen) - patLen;  // NOTE! if return value is negative then string was not found
*/

/* 2.234 sec 
    char c   = *p;
    unsigned int j = 0; // from begining of buffer
    unsigned int limit    = m_bufLen - m_patLen;
    while (j <= limit) {   // this is optimized not readable code  it si 14% faster
                           // see below in comment for original code 
       if (  (lastChar == (c = *(p += skip) ) ) 
          && (memcmp(patt, buff + j, patLen_1) == 0)) {
          return j;
       }
       j += (skip = *(b + (UInt8_t)c ) );
    }
    return j - m_bufLen;  // NOTE! if return value is negative then string was not found
*/
/* 2.213    ORIGINAL code
    unsigned int j = 0; // from begining of buffer
    unsigned int limit = m_bufLen - m_patLen;
    while (j <= limit) {
       char c = *p;       
       if (  (lastChar == c) 
          && (memcmp(patt, buff + j, patLen_1) == 0)) {
          return j;
       }
       int skip = *(b + (UInt8_t)c );
       j += skip;
       p += skip; 
    }
    return j - m_bufLen;  // NOTE! if return value is negative then string was not found
*/
}

Here is the call graph for this function:

void cmn_FastPattSearch::SetPattern ( const string &  a_pattern  )  [inline, virtual]

set new search pattern

Implements cmn_FastSearch.

Definition at line 171 of file cmn_FastSearch.h.

{m_skipTbl_p->SetPattern(a_pattern);};


Member Data Documentation

Macro to add class name member s_className.

Reimplemented from cmn_FastSearch.

Definition at line 166 of file cmn_FastSearch.h.

bad-character skip table instance, it is possible that is preconfigured in case of searching already known string

Definition at line 163 of file cmn_FastSearch.h.

Referenced by Find().


The documentation for this class was generated from the following files: