Remove total rows

3

I have the following structure of a database:

MES EST.DET1 EST.DET2 EST.DET3 DIAS
2  Curso 1  Turma A    Manha    5
2  Curso 1  Turma A    Tarde    5
2  Curso 1  Turma B     <NA>    5
2  Curso 1     <NA>     <NA>   15
2  Curso 2  Turma A     <NA>    7
2  Curso 2     <NA>     <NA>    7
2  Curso 3     <NA>     <NA>   10
3  Curso 1  Turma A    Manha    6
3  Curso 1  Turma A    Tarde    6
3  Curso 1  Turma B     <NA>    6
3  Curso 1     <NA>     <NA>   18
3  Curso 2  Turma A     <NA>    7
3  Curso 2     <NA>     <NA>    7
3  Curso 3     <NA>     <NA>   13
4  Curso 1  Turma A    Manha    5
4  Curso 1  Turma A    Tarde    5
4  Curso 1  Turma B     <NA>    5
4  Curso 1     <NA>     <NA>   15
4  Curso 2  Turma A     <NA>    6
4  Curso 2     <NA>     <NA>    6
4  Curso 3     <NA>     <NA>   10

Basically, there are 3 courses that run over three months, and courses 1 and 2 have "child" structures. Course 1 has 2 classes (A and B), and class A can take place in the morning or in the afternoon. Course 2 has only class A, and course 3 has no detailed "child" structures.

The 4th row, and the corresponding Course 1 rows for the other months, is nothing more than the total (sum) of the "child" structures. The same goes for the 6th row (Course 2).

Is there any way to filter my data so that these total rows are removed? (Note that the Course 3 rows should be kept.)

    
asked by anonymous 28.11.2017 / 20:44

1 answer

3

The simplest way is to use a logical vector to select the rows you want to keep:

data <- c("2", "Curso 1", "Turma A", "Manha", "5",
          "2", "Curso 1", "Turma A", "Tarde", "5",
          "2", "Curso 1", "Turma B", "",      "5",
          "2", "Curso 1", "",        "",      "15",
          "2", "Curso 2", "Turma A", "",      "7",
          "2", "Curso 2", "",        "",      "7",
          "2", "Curso 3", "",        "",      "10",
          "3", "Curso 1", "Turma A", "Manha", "6",
          "3", "Curso 1", "Turma A", "Tarde", "6",
          "3", "Curso 1", "Turma B", "",      "6",
          "3", "Curso 1", "",        "",      "18",
          "3", "Curso 2", "Turma A", "",      "7",
          "3", "Curso 2", "",        "",      "7",
          "3", "Curso 3", "",        "",      "13",
          "4", "Curso 1", "Turma A", "Manha", "5",
          "4", "Curso 1", "Turma A", "Tarde", "5",
          "4", "Curso 1", "Turma B", "",      "5",
          "4", "Curso 1", "",        "",      "15",
          "4", "Curso 2", "Turma A", "",      "6",
          "4", "Curso 2", "",        "",      "6",
          "4", "Curso 3", "",        "",      "10")

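# Reshape the vector into a 5-column data frame, filling it row by row
# (every column, including DIAS, ends up as text)
data <- data.frame(matrix(data, ncol = 5, byrow = TRUE))

# Name the columns as in the question
names(data) <- c("MES", "EST.DET1", "EST.DET2", "EST.DET3", "DIAS")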

The command below keeps the rows that are not empty in column 3 or that contain 'Curso 3' in column 2.

data[data[,3] != '' | data[,2] == 'Curso 3', ]

Result:

   MES EST.DET1 EST.DET2 EST.DET3 DIAS
1    2  Curso 1  Turma A    Manha    5
2    2  Curso 1  Turma A    Tarde    5
3    2  Curso 1  Turma B             5
5    2  Curso 2  Turma A             7
7    2  Curso 3                     10
8    3  Curso 1  Turma A    Manha    6
9    3  Curso 1  Turma A    Tarde    6
10   3  Curso 1  Turma B             6
12   3  Curso 2  Turma A             7
14   3  Curso 3                     13
15   4  Curso 1  Turma A    Manha    5
16   4  Curso 1  Turma A    Tarde    5
17   4  Curso 1  Turma B             5
19   4  Curso 2  Turma A             6
21   4  Curso 3                     10
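
The filter above relies on empty strings and on naming 'Curso 3' explicitly. If the real data holds NA in those cells, as the table in the question suggests, a comparison such as `!= ''` returns NA instead of TRUE/FALSE. The sketch below is one possible variation, not part of the original answer: it assumes a total row is any row whose EST.DET2 is NA while other rows with the same MES and EST.DET1 do have a class, so a course without children is kept automatically. The names `df`, `has_detail`, `group_has_detail`, `is_total` and `result` are illustrative only, and only month 2 is rebuilt to keep the example short.

```r
# Small sample for month 2, using NA (as in the question) instead of ""
df <- data.frame(
  MES      = c(2, 2, 2, 2, 2, 2, 2),
  EST.DET1 = c("Curso 1", "Curso 1", "Curso 1", "Curso 1",
               "Curso 2", "Curso 2", "Curso 3"),
  EST.DET2 = c("Turma A", "Turma A", "Turma B", NA, "Turma A", NA, NA),
  EST.DET3 = c("Manha", "Tarde", NA, NA, NA, NA, NA),
  DIAS     = c(5, 5, 5, 15, 7, 7, 10),
  stringsAsFactors = FALSE
)

# TRUE where the row names a class; is.na() keeps NA out of the mask
has_detail <- !is.na(df$EST.DET2)

# For each MES / EST.DET1 group: does at least one row name a class?
group_has_detail <- ave(has_detail, df$MES, df$EST.DET1, FUN = any)

# Assumption: a total row has no class but sits in a group that has some
is_total <- !has_detail & group_has_detail

# Keep everything that is not a total row
result <- df[!is_total, ]
print(result)
```

With this rule the Curso 3 row stays because its group has no detailed rows at all, so there is no need to list such courses by hand.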
    
30.11.2017 / 01:26