Build database

2

I need to build a database with 200 rows and 4 columns (with the 4 variables I designed). I devised a study to estimate the presence of snakes in a particular city, I have no data. I thought the presence of snakes was due to these 4 variables: Precipitation, temperature, Snake Presence (1-yes, 0-no), Seasons of the year (1-Spring, 2-Summer, 3-Autumn, 4-Winter ). Can anyone help?

    
asked by anonymous 09.10.2017 / 21:51

4 answers

4

I would solve this problem as follows using R . I will set the number of rows in my database:

n <- 200

I'm going to create n random values for the precipitation, temperature and presence of snakes with the exponential, normal and binomial distributions, respectively:

set.seed(1234) # para que os resultados sejam reproduziveis
precipitacao <- round(rexp(n, rate=2), digits=1)
temperatura  <- round(rnorm(n, mean=25), digits=0)
presenca     <- rbinom(n, size=1, prob=0.2)

Then I'm going to create a vector called estacao , with the defined seasons:

estacao <- rep(c(1, 2, 3, 4), each=n/4)

Finally, all together in a single object, called data:

dados <- data.frame(precipitacao, temperatura, presenca, estacao)
head(dados)
##    precipitacao temperatura presenca estacao
## 1           1.3          25        0       1
## 2           0.1          23        0       1
## 3           0.0          26        0       1
## 4           0.9          26        0       1
## 5           0.2          27        0       1
## 6           0.0          26        0       1
## 7           0.4          26        1       1
## 8           0.1          25        0       1
## 9           0.4          25        1       1
## 10          0.4          27        0       1

Note that I did not assume any kind of dependency between the variables. For example, the generation of numbers indicating the presence or absence of snakes will not have, in this example, a relation to recorded precipitation or temperature. Everything here was generated in a random and independent way. Perhaps the data collected in the future does not exhibit this behavior.

    
10.10.2017 / 03:12
4

In PostgreSQL you can use the random() function to simulate such data without the need for an external language, with SQL only. I imagine it's something easily transportable to other database systems.

First, I created a table with the specifications given in the question:

create table presenca_de_cobras (
  precipitacao numeric(4,1) not null,
  temperatura int not null,
  presenca boolean not null,
  estacao int not null,
  constraint estacoes check (estacao in (1,2,3,4))
);

Each field must have a different data range, such as precipitation in hundreds of millimeters, I imagine, and the presence of snakes in boolean "true" or "false."

The random() function returns a random value of 0 to 1, so multiply by the maximum limit of the value in question. In addition to random() , I used the floor() function to round down the station random value and generate_series() to build a hundred-register table, but I do not need the data it returns. All of this is passed to INSERT that will populate the table:

-- insere os dados
insert into presenca_de_cobras 
select 
  (random()*200)::numeric(4,1) precipitacao,  -- precipitacao de 0-200 milimetros
  (random()*40)::int temperatura,             -- temperatura de 0-40 graus
  (random() > 0.5) presenca,                  -- presenca TRUE ou FALSE
  floor(random()*4+1)::int estacao            -- estação de 1-4
from generate_series(1,100) as s;

Finally, just check the table itself:

select * from presenca_de_cobras limit 10;
| precipitacao | temperatura | presenca | estacao |
|--------------|-------------|----------|---------|
|        137.2 |          16 |    false |       4 |
|        164.5 |          38 |    false |       3 |
|          4.3 |          14 |     true |       3 |
|        134.9 |          38 |    false |       2 |
|          3.9 |          18 |    false |       3 |
|         67.3 |          37 |    false |       1 |
|        140.3 |          34 |     true |       3 |
|         34.2 |          35 |     true |       4 |
|         56.3 |          37 |     true |       1 |
|        171.5 |           5 |     true |       4 |

Follow SQL Fiddle with all the steps: link

    
09.10.2017 / 22:59
2

I believe the database can be created with the following commands:

CREATE DATABASE Estudo;
CREATE TABLE Persons (
id int,
precipitacao varchar(255),
temperatura varchar(255),
estacao varchar(255),
presenca Serpente boolean 
);

That's how popular the table is as you want, I hope I have helped.

    
09.10.2017 / 22:11
-2

You can start with the

CREATE TABLE table_name (
    column1 datatype,
    column2 datatype,
    column3 datatype,
   ....
);
    
09.10.2017 / 22:05