I have hundreds of digital images of dogs and cats, I need to make an algorithm to recognize when it is the dog and when it is the cat. What steps should I take?
First of all, it's worth saying that this is a famous machine-learning problem. It is available as a Kaggle Challenge, from which you can also download the dataset. In fact, that's where I downloaded the data used to write this answer.
I will show a very simple methodology for training a classifier for this problem. The answer is pretty much a hello world of this field, but it might help. This article describes a much more advanced methodology for prediction (it correctly classifies 82% of the images).
Note also that this is an R solution to this problem.
In R you can read the images using the imager package.
library(imager)
library(dplyr)
library(tidyr)
library(stringr)
img <- imager::load.image("train/cat.0.jpg")
First, I'll shrink the image and standardize it to 100 x 100. This is not a mandatory step, although it is recommended to keep the problem manageable. I will also work with grayscale images instead of coloured ones, to reduce the dimensionality even further.
img <- imager::grayscale(img)
img <- imager::resize(img, 100, 100)
Now we have a 100 x 100 matrix in which each element represents the gray tone of a pixel.
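Just as a quick optional check: an imager image is stored as a 4-dimensional array (width, height, depth, colour channels), so after the two steps above the dimensions should look like this:
dim(img)
# 100 100 1 1  -> 100 x 100 pixels, 1 depth slice, 1 (gray) channel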
I prefer to represent the image as a data.frame in R, because it is easier to manipulate. So I use the following code.
img_df <- as.matrix(img) %>%
  data.frame() %>%                # columns X1 ... X100, one per pixel column
  mutate(x = 1:nrow(.)) %>%       # pixel row index
  gather(y, t, -x) %>%            # long format: one row per pixel
  mutate(y = extract_numeric(y))  # "X12" -> 12, the pixel column index
Here the image is represented in 3 columns of a data.frame. The first two, x and y, identify the position of the pixel. The third, t, represents the pixel's gray tone.
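If you want to peek at this structure, head(img_df) shows the first pixels:
head(img_df)  # columns: x, y (pixel position) and t (gray tone, in imager usually between 0 and 1)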
To feed a statistical model / machine-learning algorithm, we need a dataset in which each row is an observation (an individual, a sample unit) and each column is a characteristic observed for that individual.
So, to classify images of cats and dogs, we need a dataset in which each image is represented by one row and each pixel of the image is one column (the pixels are the information observed for each image). In addition, we will need a column indicating whether the image is of a cat or a dog, to train the algorithm / estimate its parameters. To convert the image to a single row I use the following command:
img_line <- img_df %>%
  mutate(colname = sprintf("x%03dy%03d", x, y)) %>%  # one column name per pixel
  select(-x, -y) %>%
  spread(colname, t)                                 # wide format: 1 row, 100 x 100 = 10,000 columns
If you wanted to consider the colour of the image in your model, at this step you would need to create a column for each pixel and each colour channel, i.e. 3 x 100 x 100 = 30,000, so each image would end up represented by a row with 30,000 columns.
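Just to sketch what this colour version could look like (I did not run it in the rest of the answer; it assumes imager::channel() to pull out one colour channel at a time and otherwise mirrors the code above):
# Sketch only: keep the 3 colour channels instead of converting to grayscale
img_cor <- imager::load.image("train/cat.0.jpg") %>%
  imager::resize(100, 100)
img_cor_line <- lapply(1:3, function(k){
  imager::channel(img_cor, k) %>%                            # channel k as a 100 x 100 image
    as.matrix() %>%
    data.frame() %>%
    mutate(x = 1:nrow(.)) %>%
    gather(y, t, -x) %>%
    mutate(y = extract_numeric(y),
           colname = sprintf("x%03dy%03dc%d", x, y, k)) %>%  # pixel position + channel in the name
    select(-x, -y) %>%
    spread(colname, t)
}) %>%
  dplyr::bind_cols()                                         # 1 row, 3 x 100 x 100 = 30,000 columns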
So far I have explained how to process a single image, but to train the algorithm several images are required. I will encapsulate the previous code in a function and use it to process a series of images.
processar <- function(path){
  img <- imager::load.image(path)   # read the image at the given path (instead of a fixed file)
  img <- imager::grayscale(img)
  img <- imager::resize(img, 100, 100)
  img_df <- as.matrix(img) %>%
    data.frame() %>%
    mutate(x = 1:nrow(.)) %>%
    gather(y, t, -x) %>%
    mutate(y = extract_numeric(y))
  img_line <- img_df %>%
    mutate(colname = sprintf("x%03dy%03d", x, y)) %>%
    select(-x, -y) %>%
    spread(colname, t)
  return(img_line)
}
For demonstration purposes, I'll get a sample of 100 dog images and 100 cat images for model training. In practice, many more images are needed.
arqs <- list.files("train", full.names = TRUE)
amostra_gato <- arqs[str_detect(arqs, "cat")] %>% sample(100)     # 100 cat images
amostra_cachorro <- arqs[str_detect(arqs, "dog")] %>% sample(100) # 100 dog images
amostra <- c(amostra_gato, amostra_cachorro)
bd <- plyr::ldply(amostra, processar)                             # one processed row per image
Y <- as.factor(rep(c("gato", "cachorro"), each = 100))            # response vector (cat, dog)
This step takes a long time and is computationally intensive: there is a lot of processing, and image files are heavy.
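If this processing step becomes a bottleneck, one possible workaround (just a sketch; mc.cores = 4 is an assumption, adjust it to your machine, and on Windows mclapply falls back to a single core) is to process the images in parallel instead of with plyr::ldply:
lista <- parallel::mclapply(amostra, processar, mc.cores = 4)  # process images in parallel
bd <- dplyr::bind_rows(lista)                                  # stack the 1-row data.frames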
From here on, any machine-learning algorithm could be used, since you have already transformed your images into a conventional dataset. Note that this also usually takes quite a while: on my computer, training with 200 images of 10,000 columns each took about 30 minutes.
I'm going to use a random forest to do the classification, but you could in fact fit any model.
m <- randomForest::randomForest(bd, Y, ntree = 100)
I will not go into the details of how the modelling should be done. The right thing would be to separate a training set and a test set, check that you are not overfitting, tune parameters using cross-validation, and so on. But that would make the answer very long, so I trained a random forest using the defaults of the R function (changing only the number of trees).
I also checked the error only on the training data (which is statistically wrong, but let's move on).
tabela <- table(predict(m, type = "class"), Y)  # confusion matrix
acerto <- sum(diag(tabela))/sum(tabela)         # proportion of correct classifications
acerto
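For reference, a minimal sketch of what a held-out evaluation could look like (the 150/50 split below is only illustrative):
set.seed(1)
idx_treino <- sample(seq_len(nrow(bd)), size = 150)             # 150 images for training
m2 <- randomForest::randomForest(bd[idx_treino, ], Y[idx_treino], ntree = 100)
pred <- predict(m2, newdata = bd[-idx_treino, ])                # predict the remaining 50
tabela_teste <- table(pred, Y[-idx_treino])                     # confusion matrix on held-out images
sum(diag(tabela_teste)) / sum(tabela_teste)                     # held-out accuracy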
With the trained model and a new processed image, use the following command to predict the category:
predict(m, newdata = img_line)
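For a brand-new image you can chain the two steps (the file name below is just an example):
nova_imagem <- processar("train/dog.1.jpg")  # example file, replace with your own image
predict(m, newdata = nova_imagem)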