Filtering in dplyr by the maximum value of a variable in the gapminder dataset

I am filtering the gapminder data frame and get an empty result when I use the gdpPercap variable:

library(gapminder) # version 0.2.0

library(dplyr)  # version 0.7.2

gapminder %>%
  filter(year == 2007, gdpPercap==max(gdpPercap)) 

# A tibble: 0 x 6

# ... with 6 variables: country <fctr>, continent <fctr>, year <int>, lifeExp <dbl>, pop <int>, gdpPercap <dbl>

If I filter on a different variable, the expected result appears:

gapminder %>%
  filter(year == 2007, pop==max(pop)) 

# A tibble: 1 x 6

# country continent  year lifeExp   pop      gdpPercap

#1   China      Asia  2007  72.961 1318683096  4959.115

Is this a bug with gdpPercap? I am using RStudio (version 1.0.143) and MRO (3.3.3).

    
asked by anonymous 09.01.2018 / 12:32

1 answer

The result is correct. The command

gapminder %>%
  filter(year == 2007, gdpPercap==max(gdpPercap)) 

returns all rows of the gapminder data frame from the year 2007 whose gdpPercap equals the maximum value of gdpPercap over the whole data frame, not just over 2007. It turns out that no country satisfies this condition. To see why, look at the maximum for each year:

gapminder %>%
  group_by(year) %>% 
  summarise(max(gdpPercap))
# A tibble: 12 x 2
    year `max(gdpPercap)`
   <int>            <dbl>
 1  1952        108382.35
 2  1957        113523.13
 3  1962         95458.11
 4  1967         80894.88
 5  1972        109347.87
 6  1977         59265.48
 7  1982         33693.18
 8  1987         31540.97
 9  1992         34932.92
10  1997         41283.16
11  2002         44683.98
12  2007         49357.19

The overall maximum of gdpPercap occurred in 1957, so it does not make sense to ask which countries reached this value in 2007. Note that the same command with the year 1957 returns a non-empty result:

gapminder %>%
  filter(year == 1957, gdpPercap==max(gdpPercap))
# A tibble: 1 x 6
  country continent  year lifeExp    pop gdpPercap
   <fctr>    <fctr> <int>   <dbl>  <int>     <dbl>
1  Kuwait      Asia  1957  58.033 212846  113523.1
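
The mismatch can also be checked directly. The following is a minimal sketch that uses base R subsetting instead of dplyr; the names overall_max and max_2007 are only illustrative:

```r
library(gapminder)

# Maximum over all years: the value that max(gdpPercap) evaluates to
# inside an ungrouped filter() on the full data frame.
overall_max <- max(gapminder$gdpPercap)

# Maximum restricted to the rows from 2007.
max_2007 <- max(gapminder$gdpPercap[gapminder$year == 2007])

# FALSE: the two maxima differ, so no 2007 row can equal the overall maximum.
overall_max == max_2007
```

Because the two values differ, the condition year == 2007 and the condition gdpPercap == max(gdpPercap) can never both hold for the same row.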

If your goal is to find the country with the highest gdpPercap in 2007, you should first group the data by year, so that max(gdpPercap) is computed within each year:

gapminder %>%
  group_by(year) %>%
  filter(year==2007, gdpPercap==max(gdpPercap))
# A tibble: 1 x 6
# Groups:   year [1]
  country continent  year lifeExp     pop gdpPercap
   <fctr>    <fctr> <int>   <dbl>   <int>     <dbl>
1  Norway    Europe  2007  80.196 4627926  49357.19
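
An alternative that is not part of the answer above, sketched under the assumption that the same packages are loaded: restrict the data to 2007 in a first filter() call and compare against the maximum in a second one. The name richest_2007 is only illustrative.

```r
library(gapminder)
library(dplyr)

richest_2007 <- gapminder %>%
  filter(year == 2007) %>%              # keep only the 2007 rows first
  filter(gdpPercap == max(gdpPercap))   # max() now sees only those rows

richest_2007
```

With the data shown above, this should give the same Norway row as the grouped version, without leaving a grouping on the result.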
    
09.01.2018 / 17:57