jubatus_core  0.1.2
Jubatus: Online machine learning framework for distributed environment
jubatus::core::fv_converter::character_ngram Class Reference

#include <character_ngram.hpp>

[Inheritance and collaboration diagrams for jubatus::core::fv_converter::character_ngram omitted.]

Public Member Functions

 character_ngram (size_t length)
 
void split (const std::string &string, std::vector< std::pair< size_t, size_t > > &ret_boundaries) const
 
Public Member Functions inherited from jubatus::core::fv_converter::word_splitter
void extract (const std::string &text, std::vector< string_feature_element > &result) const
 
 word_splitter ()
 
virtual ~word_splitter ()
 
Public Member Functions inherited from jubatus::core::fv_converter::string_feature
virtual ~string_feature ()
 

Private Attributes

const size_t length_
 

Detailed Description

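Word splitter that cuts a string into character n-grams of length_ characters. split() reports each n-gram as a (byte offset, byte length) pair into the input string.

Definition at line 29 of file character_ngram.hpp.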

Constructor & Destructor Documentation

jubatus::core::fv_converter::character_ngram::character_ngram ( size_t  length)
inline explicit

Definition at line 31 of file character_ngram.hpp.

explicit character_ngram(size_t length)
    : length_(length) {
}

Member Function Documentation

void jubatus::core::fv_converter::character_ngram::split ( const std::string &  string,
std::vector< std::pair< size_t, size_t > > &  ret_boundaries 
) const
virtual

Implements jubatus::core::fv_converter::word_splitter.

Definition at line 35 of file character_ngram.cpp.

References length_.

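void character_ngram::split(
    const std::string& string,
    std::vector<std::pair<size_t, size_t> >& ret_boundaries) const {
  // Note: length_ == 0 is not guarded here; queue would then be empty
  // and queue[p] below would be out of range.
  const size_t len = length_;
  // Ring buffer of the start offsets of the last len characters.
  // Zero-initialized, so the first n-gram starts at offset 0.
  std::vector<size_t> queue(len);
  size_t p = 0;  // next ring-buffer slot to read and overwrite
  size_t n = 0;  // number of complete characters seen so far

  std::vector<std::pair<size_t, size_t> > bounds;
  for (size_t i = 1; i <= string.size(); ++i) {
    // i is a character boundary if it is the end of the string or the
    // first byte of a character.
    if (i == string.size() || is_begin_of_character(string[i])) {
      ++n;
      if (n >= len) {
        // queue[p] is the start of the character len characters back.
        size_t b = queue[p];
        bounds.push_back(std::make_pair(b, i - b));
      }
      queue[p] = i;
      ++p;
      if (p == len) {
        p = 0;
      }
    }
  }

  // Publish the result only after the scan has finished.
  bounds.swap(ret_boundaries);
}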
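
The listing above relies on is_begin_of_character, which is not shown on this page. The following is a minimal, self-contained sketch of the same ring-buffer scan, written for C++11 and assuming UTF-8 input, where a byte starts a character unless it has the form 10xxxxxx. The helper definition, the free functions split_ngrams and ngram_strings, and the guard for a zero length are illustrative assumptions of this sketch, not part of the Jubatus API.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Assumption: UTF-8 input. A byte begins a character unless it is a
// continuation byte (bit pattern 10xxxxxx).
static bool is_begin_of_character(unsigned char c) {
  return (c & 0xC0) != 0x80;
}

// Hypothetical free-function counterpart of character_ngram::split.
// Returns (byte offset, byte length) pairs, one per character n-gram.
std::vector<std::pair<std::size_t, std::size_t> > split_ngrams(
    const std::string& s, std::size_t len) {
  std::vector<std::pair<std::size_t, std::size_t> > bounds;
  if (len == 0) {
    return bounds;  // guard added in this sketch only
  }
  // Ring buffer of the start offsets of the last len characters,
  // zero-initialized so the first n-gram starts at offset 0.
  std::vector<std::size_t> queue(len);
  std::size_t p = 0;  // slot holding the oldest start offset
  std::size_t n = 0;  // complete characters seen so far
  for (std::size_t i = 1; i <= s.size(); ++i) {
    if (i == s.size() ||
        is_begin_of_character(static_cast<unsigned char>(s[i]))) {
      ++n;
      if (n >= len) {
        const std::size_t b = queue[p];
        bounds.push_back(std::make_pair(b, i - b));
      }
      queue[p] = i;
      p = (p + 1) % len;
    }
  }
  return bounds;
}

// Hypothetical helper: materialize each boundary as a substring.
std::vector<std::string> ngram_strings(const std::string& s,
                                       std::size_t len) {
  std::vector<std::string> out;
  for (const auto& b : split_ngrams(s, len)) {
    out.push_back(s.substr(b.first, b.second));
  }
  return out;
}
```

With these definitions, ngram_strings("abc", 2) yields "ab" and "bc", corresponding to the boundaries (0, 2) and (1, 2). An input with fewer characters than the requested length yields no n-grams.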

Member Data Documentation

const size_t jubatus::core::fv_converter::character_ngram::length_
private

Definition at line 40 of file character_ngram.hpp.

Referenced by split().


The documentation for this class was generated from the following files: