I was given this simple (or so I thought!) challenge of writing a "tokenizer": I had to split the string " O rato roeu a roupa do rei de roma "
on spaces. After a long time, I came up with the following algorithm, using vectors, the <algorithm> header, and strings.
#ifndef STR_PARSE_HPP
#define STR_PARSE_HPP
#include <algorithm>
#include <string>
#include <vector>
using std::reverse;
using std::vector;
using std::string;
vector<string> split(string str, char delimiter = ' ')
{
    vector<string> ret;
    if((str.find(delimiter) == string::npos) && (str.find_first_not_of(delimiter) == string::npos)) throw nullptr;
    else if ((str.find(delimiter) == string::npos)) ret.push_back(str);
    else if(str.find_first_not_of(delimiter) == string::npos) ret.push_back(string(""));
    else
    {
        unsigned i = 0;
        string strstack;
        while(str[0] == delimiter) {str.erase(0,1);}
        reverse(str.begin(), str.end());
        while(str[0] == delimiter) {str.erase(0,1);}
        reverse(str.begin(), str.end());
        while(!str.empty())
        {
            ret.push_back(str.substr(i, str.find(delimiter)));
            str.erase(0,str.find(delimiter));
            while(str[0] == delimiter) {str.erase(0,1);}
        }
    }
    return ret;
}
#endif // STR_PARSE_HPP
The test:
#include <iostream>
#include "str_parse.hpp"
using std::string;
using std::cout;
int main()
{
    string a = " O rato roeu a roupa do rei de roma ";
    for(int i = 0; i < split(a).size(); i++)
        cout << split(a)[i] << '\n'; // one token per line, matching the output below
}
The output was as expected:
O
rato
roeu
a
roupa
do
rei
de
roma
Since I had already spent "a bit" of time on this, I decided to test it with other delimiters as well. The program crashes immediately, and my debugger is misbehaving (execution runs straight past the breakpoints). What is wrong with my code?
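
For comparison, here is a minimal sketch of the same tokenizer written so that it only scans the input with `find_first_not_of` and `find`, never erasing from or reversing the string. It is not a diagnosis of the crash. The name `split_tokens` is hypothetical, and the behavior is an assumption about what is wanted: runs of delimiters are collapsed, leading and trailing delimiters are ignored, and an empty or all-delimiter input yields an empty vector instead of throwing.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not part of the code above): splits `text` on
// `delimiter`, skipping empty tokens, without modifying the input.
std::vector<std::string> split_tokens(const std::string& text, char delimiter = ' ')
{
    std::vector<std::string> tokens;

    // Position of the first character that is not a delimiter (npos if none).
    std::string::size_type start = text.find_first_not_of(delimiter);
    while (start != std::string::npos)
    {
        // End of the current token: the next delimiter, or npos for the last token.
        const std::string::size_type end = text.find(delimiter, start);

        // substr clamps the count, so end == npos takes the rest of the string.
        tokens.push_back(text.substr(start, end - start));

        // Skip the run of delimiters before the next token (npos if none is left).
        start = text.find_first_not_of(delimiter, end);
    }
    return tokens;
}
```

Under these assumptions, `split_tokens(a)` on the test string should give the same nine words, and `split_tokens("a,,b,", ',')` should give `a` and `b`.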