I am studying empirical distribution functions in R. To better understand the numerical method, I wrote my own approximation as the following function:
funcao_empirica <- function(x, n = length(x)) {
  ## sort the values of x
  x <- sort(x)
  ## collect the distinct values of x (computed, but not used below)
  unicos <- unique(x)
  ## image of the sorted values: 1/n, 2/n, ..., n/n
  y <- cumsum(rep(1, n)) / n
  ## get the x and y coordinates (FALSE instead of F, because F is reassigned later)
  coord <- xy.coords(x, y, setLab = FALSE)
  ## return the coordinates as a named list
  lista <- list(x = coord$x, y = coord$y)
  return(lista)
}
I define x as rnorm(100), that is, I create a vector x with 100 values drawn from the standard normal.
Because R provides the ecdf function to compute this approximation, I also create an empirical function called F:
F = ecdf(rnorm(length(dados$dominio)))
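As a sanity check, here is a minimal sketch comparing the hand-rolled rule (i/n at the i-th sorted value, the same idea as `cumsum(rep(1,n))/n`) with `ecdf` on one and the same sample. The names `amostra`, `y_manual`, `F_ref`, and `max_diff` are illustrative and the seed is arbitrary. Note that the line above draws a fresh sample inside `ecdf`, so the two curves in my plot do not describe the same data.

```r
# Illustrative check (hypothetical names): hand-rolled ECDF vs stats::ecdf
set.seed(1)                                   # arbitrary seed, reproducibility only
amostra <- rnorm(100)                         # one sample, used by both methods

x_ord    <- sort(amostra)
y_manual <- seq_along(x_ord) / length(x_ord)  # i/n at the i-th order statistic

F_ref <- ecdf(amostra)                        # reference implementation from stats

# Largest absolute gap between the two methods at the sample points
max_diff <- max(abs(y_manual - F_ref(x_ord)))
print(max_diff)
```

If the sample has no ties, which is the usual case for continuous draws, the two should agree at every sample point.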
I convert the output of my function into a data frame so I can use it in ggplot:
x <- dados$x
y <- dados$y
dados <- data.frame(dominio = x, imagem = y)
Now I plot a step line for my method, a line for the ecdf method, and finally a scatter plot for the rnorm data:
ggplot(dados, aes(x=x, y=y)) + geom_step(color = 'blue') + stat_function(fun = F, colour = "red") +
geom_point(stat = "function", fun = rnorm)
Getting the image
As you can see, the points are widely scattered, which distorts the image a little. I suspect this is because I am not passing the parameter 100 to rnorm inside the function.
How do I set the parameter of rnorm equal to 100?
Or, if that is not possible, how can I limit the plotting region to between 0 and 1?
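One possible direction, offered as a sketch rather than a definitive answer: `stat = "function"` evaluates `fun` on a grid of x values, so `rnorm` receives that grid as its first argument and returns one random draw per grid point, which is why the points spread far outside [0, 1]. The sketch below assumes the sample is drawn once with `rnorm(100)`, plots the points at their ECDF heights, and uses `coord_cartesian(ylim = c(0, 1))` to restrict the visible region. The names `amostra`, `F_emp`, `dados_plot`, and `grafico` are illustrative, and the plotting part runs only if ggplot2 is installed.

```r
# Sketch under stated assumptions; all object names here are hypothetical.
set.seed(1)                       # arbitrary seed, reproducibility only
amostra <- rnorm(100)             # the sample is drawn once, with n = 100
F_emp   <- ecdf(amostra)          # ECDF of that same sample

dados_plot <- data.frame(
  dominio = sort(amostra),
  imagem  = seq_along(amostra) / length(amostra)   # hand-rolled i/n values
)

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  grafico <- ggplot(dados_plot, aes(x = dominio, y = imagem)) +
    geom_step(colour = "blue") +                      # hand-rolled ECDF
    stat_function(fun = F_emp, colour = "red") +      # ecdf() of the same sample
    geom_point(aes(y = F_emp(dominio)), size = 0.8) + # sample points on the curve
    coord_cartesian(ylim = c(0, 1))                   # show only y in [0, 1]
  print(grafico)
}
```

`coord_cartesian` zooms without discarding data, whereas setting limits on the y scale would drop points outside the range before drawing.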