Return all CSS classes with Regular Expression

5

I need to return all classes found inside a CSS string, so that when the expression is run against:

div.classe1{/*...*/}
.classe2 div a{/*...*/}
.classe3.classe4{/*...*/}
.classe5{/*...*/}

it returns an array (with or without the dot before each class name, either is fine):

["classe1","classe2","classe3","classe4","classe5"]

What I've tried so far was this code:

\.(-?[_a-zA-Z]+[_a-zA-Z0-9-]*)(?![^\{]*\})

But apparently it did not work very well ...

FIDDLE for a better understanding.

    
asked by anonymous 24.02.2014 / 21:14

5 answers

5

My suggestion is to use a full CSS parser, such as JSCSSP, and extract the classes from individual selectors (instead of from the entire CSS text).

function extrairClasses(css) {
    var classes = [];

    var parser = new CSSParser();
    var sheet = parser.parse(css, false, true);
    for ( var i = 0 ; i < sheet.cssRules.length ; i++ ) {
        var seletor = sheet.cssRules[i].mSelectorText;
        if ( seletor )
            classes = classes.concat(seletor.match(/\.\w+/g) || []); // match() returns null when the selector has no classes
    }

    return classes;
}

Example on jsFiddle. Note that it works even in the presence of "degenerate" cases, such as a comment containing " or a string containing /* (both of which also contain .classe).
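One caveat with the \w+ pattern used on each selector: it does not match hyphens, so a selector such as .btn-primary would be reported as .btn. Below is a minimal sketch of a per-selector helper that uses the identifier pattern from the question instead. The name classesFromSelector is hypothetical and is not part of JSCSSP; it only assumes it receives one selector string.

```javascript
// Hypothetical helper (not part of JSCSSP): extracts class names from ONE selector string.
function classesFromSelector(selector) {
    // A dot, an optional leading hyphen, a letter or underscore,
    // then any run of letters, digits, hyphens or underscores.
    var pattern = /\.(-?[_a-zA-Z][_a-zA-Z0-9-]*)/g;
    var classes = [];
    var match;
    while ((match = pattern.exec(selector)) !== null) {
        classes.push(match[1]); // capture group 1 is the name without the dot
    }
    return classes;
}

console.log(classesFromSelector("a.btn-primary:hover > .nav_item"));
// ["btn-primary", "nav_item"]
```

Note that this sketch still treats any dot followed by an identifier as a class, so a dot inside an attribute selector value would also be picked up.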

24.02.2014 / 23:26
1

Here is a pure JS implementation:

var texto = "div.classe1{background:red}.classe2 div a{background:#00f}.classe3.classe4{background:green}.classe5{background:#ff0}";

var retorno = texto.match(/\.(-?[_a-zA-Z]+[_a-zA-Z0-9-]*)(?![^\{]*\})/ig); //[".classe1", ".classe2", ".classe3", ".classe4", ".classe5"]

Note the /ig flags at the end: g searches the entire string (not just the first occurrence) and i makes the match case-insensitive.
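To illustrate what the g flag changes, and how to drop the leading dots if the bare names are wanted, here is a small sketch. The variable names are only illustrative, and the input is a shortened version of the string above.

```javascript
var css = "div.classe1{color:red}.classe2 div a{color:blue}";
var pattern = /\.(-?[_a-zA-Z]+[_a-zA-Z0-9-]*)(?![^\{]*\})/;

// Without the g flag, match() stops at the first occurrence
// and returns the full match followed by its capture groups.
var first = css.match(pattern);
console.log(first[0], first[1]); // .classe1 classe1

// With the g flag, match() returns every full match (capture groups are dropped).
// It returns null when nothing matches, hence the fallback to an empty array.
var all = css.match(new RegExp(pattern.source, "g")) || [];
console.log(all); // [".classe1", ".classe2"]

// Strip the leading dot from each match.
var names = all.map(function (name) { return name.slice(1); });
console.log(names); // ["classe1", "classe2"]
```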

    
24.02.2014 / 22:10
1

Thinking about handling comments and hacks that start with dot, I thought of the following function:

function parseClasses(cssSource) {

    // remove comments
    var semComentarios = cssSource.replace(/\/\*([\s\S]*?)\*\//g, '');

    // remove the declaration blocks
    var semBlocos = semComentarios.replace(/{\s*[^}]*\s*}/g, '');

    // collect the classes from what is left (match() returns null when there are none)
    return semBlocos.match(/\.-?[_a-zA-Z]+[_a-zA-Z0-9-]*/g) || [];

}

Example usage:

var classes = parseClasses(str);
for (var i = 0; i < classes.length; i++) {
    console.log(classes[i])
}

Demo on jsfiddle
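Because the same class often appears in several selectors, the result may contain duplicates, and each entry keeps its leading dot. Here is a minimal sketch of a post-processing step. normalizeClasses is a hypothetical name, and the input is assumed to be an array of strings such as the one match returns.

```javascript
// Hypothetical post-processing helper: strips leading dots and removes duplicates,
// keeping the order of first appearance.
function normalizeClasses(matches) {
    var seen = {};
    var result = [];
    (matches || []).forEach(function (item) {
        var name = item.charAt(0) === "." ? item.slice(1) : item;
        if (!Object.prototype.hasOwnProperty.call(seen, name)) {
            seen[name] = true;
            result.push(name);
        }
    });
    return result;
}

console.log(normalizeClasses([".classe1", ".classe2", ".classe1"]));
// ["classe1", "classe2"]
```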

    
24.02.2014 / 22:11
0

Using regular expressions, /([^{]+)\s*\{[^}]*}/ matches every rule and captures its selector, and /\.(-?[a-z_]+[a-z0-9-_]*)/ matches every class. Example:

var style = ".class1 { text-align: left; font-weight: bold} " +
            "/* -- Comment -- */" +
            ".class2.class3 { border: 1px solid #a1a1a1 } " +
            "textarea.class4 { height: 400px } " +
            ".class5 ~ .class6 { display: none } " +
            ".class7 .class8:first-child { color: #ababab } ";
function parseCss(stylesheet) {
    var stylesheetPattern = /([^{]+)\s*\{[^}]*}/gm,
        selectorPattern = /\.(-?[a-z_]+[a-z0-9-_]*)/ig,
        classes = new Array(),
        selector, match;
    stylesheet = stylesheet.replace(/\/\*[\s\S]*?\*\//g, ""); // strip comments (non-greedy, may span lines)
    while(match = stylesheetPattern.exec(stylesheet)) {
        while(selector = selectorPattern.exec(match[1])) {
            classes.push(selector[1]);
        }
    }
    return classes;
}
parseCss(style);

In this case, parseCss() returns ["class1", "class2", "class3", "class4", "class5", "class6", "class7", "class8"].

    
25.02.2014 / 10:56
0

We can parse the string character by character to skip classes that fall inside comments such as /* .classe */; in my view this is the best way to extract all CSS classes from a string.

$string = <<<EOF
div.classe1{/*comment div.classe1b*/}
div.-classe2{/*div.-classe2b*/}
div._classe3{/*div._classe3b*/}
.classe4 div a{/*.classe4b div a*/}
.classe5.classe6{/*.classe5b.classe6b*/}
.classe7{/*......classe7b......*/}
.classe8{this one has no comments, but .classe8b is not picked up either because it is inside the braces}
.cl{/*.clb 2 characters*/}
.c{/*.cb 1 character*/}
.d{/*.db 1 character*/}
EOF;

$length = strlen( $string );

$brackets = false;
$comment = false;
$dot = false;
$class = '';
$classes = array();

for ( $i = 0, $j = 0; $i < $length; $i++ ) {
  if ( $i + 1 < $length && $string[ $i ] === "\x2f" && $string[ $i + 1 ] === "\x2a" ) {
    $comment = true;
    continue;
  } else if ( $string[ $i ] === "\x7b" ) {
    $brackets = true;
    continue;
  } else if ( $brackets === false && $comment === false && $string[ $i ] === "\x2e" ) {
    $dot = true;
    continue;
  } else if ( $i + 1 < $length && $string[ $i ] === "\x2a" && $string[ $i + 1 ] === "\x2f" ) {
    $comment = false;
    continue;
  } else if ( $brackets === true && $string[ $i ] === "\x7d" ) {
    $brackets = false;
    continue;
  }
  if ( $dot ) {
    $j = $i + 1;
    if ( ( ( $string[ $i ] >= "\x41" && $string[ $i ] <= "\x5a" ) || ( $string[ $i ] >= "\x61" && $string[ $i ] <= "\x7a" ) || ( $string[ $i ] === "\x2d" ) || ( $string[ $i ] === "\x5f" ) ) === false ) {
      $class = '';
      $dot = false;
      continue;
    }
    $class = $string[ $i ];
    while ( $j < $length && ( ( $string[ $j ] >= "\x30" && $string[ $j ] <= "\x39" ) || ( $string[ $j ] >= "\x41" && $string[ $j ] <= "\x5a" ) || ( $string[ $j ] >= "\x61" && $string[ $j ] <= "\x7a" ) || ( $string[ $j ] === "\x2d" ) || ( $string[ $j ] === "\x5f" ) ) ) {
      $class .= $string[ $j ];
      $j++;
    }
    array_push( $classes, $class );
    $class = '';
    $dot = false;
    $i = $j - 1;
  }
}

echo '<pre style="font-size: 14px; font-family: Consolas; line-height: 20px; tab-size: 4;">';
var_export( $classes );
echo '</pre>';
die();

The result of var_export is an array with all the classes. The goal of skipping classes that appear inside comments is achieved, and classes that appear for whatever reason inside braces are skipped as well. This may not work well with @media rules; I am assuming your code is "normal" CSS without @media. If @media is present, handling for those parts would have to be added. My original code did not have the $brackets flag; I added it last.

I wrote the code just now after reading the question and only ran quick tests, so I cannot guarantee that it works 100%. It extracts classes that start with -, _, a-z or A-Z, optionally followed by 0-9, -, _, a-z or A-Z.

array (
  0 => 'classe1',
  1 => '-classe2',
  2 => '_classe3',
  3 => 'classe4',
  4 => 'classe5',
  5 => 'classe6',
  6 => 'classe7',
  7 => 'classe8',
  8 => 'cl',
  9 => 'c',
  10 => 'd',
)

PS: I know the question is tagged JavaScript and the code posted here is PHP, but I decided to answer anyway because I consider this the correct answer to your question and the one that gives the best results. Given how similar PHP and JavaScript are in syntax and functions, only minimal adaptations are needed to convert it to JavaScript. I hope this helps.
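As a rough idea of what such a conversion could look like, here is a sketch of the same character-by-character scan in JavaScript. It is not a line-by-line port: the name extractClasses is hypothetical, regular expressions are used only to test single characters, and, unlike the PHP version, braces inside comments are ignored. It still assumes plain CSS without @media or other nested blocks.

```javascript
// Sketch of a JavaScript version of the character-by-character scan.
// Assumption: plain CSS without @media or other nested blocks.
function extractClasses(css) {
    var isStart = /[A-Za-z_-]/;   // allowed first character of a class name
    var isPart = /[A-Za-z0-9_-]/; // allowed following characters
    var classes = [];
    var inComment = false;
    var inBlock = false;
    var i = 0;

    while (i < css.length) {
        var ch = css.charAt(i);
        var next = css.charAt(i + 1); // "" past the end of the string

        if (inComment) {
            // Ignore everything until the comment closes.
            if (ch === "*" && next === "/") { inComment = false; i += 2; } else { i++; }
        } else if (ch === "/" && next === "*") {
            inComment = true;
            i += 2;
        } else if (ch === "{") {
            inBlock = true;
            i++;
        } else if (ch === "}") {
            inBlock = false;
            i++;
        } else if (!inBlock && ch === "." && isStart.test(next)) {
            // Read the class name that follows the dot.
            var j = i + 1;
            while (j < css.length && isPart.test(css.charAt(j))) { j++; }
            classes.push(css.slice(i + 1, j));
            i = j;
        } else {
            i++;
        }
    }
    return classes;
}

console.log(extractClasses("div.classe1{/*.classe1b*/}.classe3.classe4{color:red}"));
// ["classe1", "classe3", "classe4"]
```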

    
25.02.2014 / 08:56