Other languages have split, explode, or something similar that breaks a string into chunks according to some separator. Is there something ready-made in C, or do I have to do it by hand?
There's nothing quite that ready-made, but there is strtok(), which parses the string and replaces the specified delimiter with a null character. What was a single string thus becomes several, since the null terminates the string at that point.
But note that it does not return an array of strings, as is common in other languages, so it does not process all the delimiters at once. It handles only the first one it finds, so for the second you need to call strtok() again, and so on. Of course, every C programmer ends up writing some utility function(s) to make this easier and deliver what you want.
#include <stdio.h>
#include <string.h>
int main(void) {
    char frutas[] = "banana,laranja,morango";
    int tamanho = strlen(frutas); // this only works for a 1-character delimiter
    char *token = strtok(frutas, ",");
    // walk the original buffer to show that the first delimiter is now a null
    for (int i = 0; i < tamanho; i++) {
        if (token[i] != '\0') putchar(token[i]);
    }
    // print each token; passing NULL makes strtok() continue where it stopped
    while (token != NULL) {
        printf("\n%s", token);
        token = strtok(NULL, ",");
    }
}
See it running on ideone and on Coding Ground. I also put it on GitHub for future reference.
In C++ there is no native 'split' function for strings.
Searching on the subject turns up a huge variety of ways to split a string.
Here are some examples I find interesting.
#include <iostream>
#include <sstream>
#include <string>
#include <vector>
using namespace std;
int main()
{
    // string to be split
    string tokenString { "aaa bbb ccc" };
    // the separated substrings will be stored in this vector
    vector<string> tokens;
    // input string stream initialized with the string to be split
    istringstream tokenizer { tokenString };
    // working variable
    string token;
    // splits the string on whitespace and stores the pieces in the destination vector
    while (tokenizer >> token)
        tokens.push_back(token);
    // prints the separated substrings
    for (const string& token: tokens)
        cout << "* [" << token << "]\n";
}
Result of example 1:
* [aaa]
* [bbb]
* [ccc]
#include <iostream>
#include <sstream>
#include <string>
#include <vector>
using namespace std;
int main()
{
    // string to be split
    string tokenString { "aaa, bbb, ccc,,ddd , eee" };
    // the separated substrings will be stored in this vector
    vector<string> tokens;
    // input string stream initialized with the string to be split
    istringstream tokenizer { tokenString };
    // working variable
    string token;
    // splits the substrings on commas and stores them in the destination vector
    while (getline(tokenizer, token, ','))
        tokens.push_back(token);
    // prints the separated substrings
    for (const string& token: tokens)
        cout << "* [" << token << "]\n";
}
Result of example 2:
* [aaa]
* [ bbb]
* [ ccc]
* []
* [ddd ]
* [ eee]
Note that the spaces in the resulting substrings have been kept. (This would be a case for another common string function, 'trim', which also does not exist in C++.)
#include <iostream>
#include <regex>
#include <string>
#include <vector>
using namespace std;
int main()
{
    // string to be split
    string tokenString { "aaa, bbb, ccc,,ddd , eee" };
    // the separated substrings will be stored in this vector
    vector<string> tokens;
    // regular expression containing the delimiters: whitespace and comma
    // (the backslash must be escaped inside an ordinary string literal)
    regex delimiters { "[\\s,]+" };
    // creates an iterator over the separated substrings
    // note: the submatch index -1 selects the text between delimiter matches, i.e. the tokens
    sregex_token_iterator tokens_begin { tokenString.begin(), tokenString.end(), delimiters, -1 };
    // end-of-sequence iterator
    auto tokens_end = sregex_token_iterator {};
    // copies the separated substrings into the destination vector
    for (auto token_it = tokens_begin; token_it != tokens_end; token_it++)
        tokens.push_back(*token_it);
    // prints the separated substrings
    for (const string& token: tokens)
        cout << "* [" << token << "]\n";
}
Result of example 3:
* [aaa]
* [bbb]
* [ccc]
* [ddd]
* [eee]
That's all for now folks.