Image processing? Question about the topic and its uses

1

Hi, how are you? I'm not sure whether "image processing" is even the right name for this. I don't know if you have seen them, but there are some apps out there that offer this kind of feature.

One of them is a word translator: when you point the phone's camera at some text, it translates the text into the language you choose.

My question is: what is this "magic" called, and what language is used for it? Is there a library that can help with development? I even thought there might be an AI that does this processing, detecting objects, text, and so on. Can you shed some light on this? Thank you.

    
asked by anonymous 05.09.2018 / 01:17

2 answers

1

It is a relatively new field that has been gaining ground, called Augmented Reality:

  

Azuma defines augmented reality as a system that:

     
  • combines virtual elements with the real environment;
  • is interactive and processes in real time;
  • is registered in three dimensions.

With processing power and memory growing quickly in mobile devices, interaction between real and virtual environments through the camera and audio is now a reality; see Pokemon Go. It is a game, of course, but it uses the processing power of your smartphone to combine, in real time, real elements (the images from your camera) with virtual elements (the "pets" that seem to be in front of you when you look at the camera image).

Augmented Reality is a concept, so any language can be used to implement it. On Android phones, for example, many applications are written in Java and call C/C++ code through JNI (Java Native Interface).

The app you mentioned, which translates words and phrases in real time when you point the phone at them, works in several steps. It seems like magic, but it is not: if you dedicate yourself, you can build a prototype of this kind of algorithm on your smartphone. Here is the recipe:

  • You need to learn about image processing

  • Learn linear algebra (basic calculations in the x, y, z dimensions)

  • Learn C to write the OCR (Optical Character Recognition) code. Yes, of course you will need to segment each word, phrase, and letter, and you will only achieve that by developing an OCR step that extracts the letters from an image or camera frame and returns them as text. This is a complex part of the code, and it is usually written in C for performance. You can use a library called OpenCV, which has many ready-made functions for image processing. Want to know in detail how to build an OCR? The steps are in this answer of mine or here; if you need something more practical, I wrote some code showing the idea of how to separate each letter using Python here

  • Learn Java (for Android)

  • Learn how to integrate C with Java using JNI

  • Train on a large dataset covering the entire alphabet in different typefaces (you want your algorithm to be robust, able to read and identify as many fonts as possible, right? Arial, italic, Times New Roman, and so on)

  • Use a dictionary to translate every word converted to text (i.e. you basically need a word / sentence translator)

  • After you have captured and translated the text, you need to replace the original text with the new one. You know the coordinates in the plane your camera is capturing, because you saved the x, y, z positions when you located the word / phrase with OCR, so now you just superimpose the translated phrase over the original ... Done, lol
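To give an idea of the letter-separation step in the recipe, here is a minimal sketch in plain Python. It assumes the image has already been binarized (1 for ink, 0 for background) and that letters are separated by at least one empty column; the function name `split_letters` and the toy image are made up for illustration. A real implementation would more likely rely on OpenCV and handle touching or slanted letters.

```python
def split_letters(binary_image):
    """Return (start, end) column ranges, end exclusive, one per letter.

    binary_image is a list of rows; 1 marks an ink pixel, 0 the background.
    Assumption: letters are separated by at least one empty column.
    """
    if not binary_image:
        return []
    width = len(binary_image[0])
    # Vertical projection: count the ink pixels in every column.
    column_ink = [sum(row[x] for row in binary_image) for x in range(width)]

    ranges = []
    start = None
    for x, ink in enumerate(column_ink):
        if ink and start is None:
            start = x                  # a letter begins here
        elif not ink and start is not None:
            ranges.append((start, x))  # an empty column ends the letter
            start = None
    if start is not None:
        ranges.append((start, width))  # letter touching the right edge
    return ranges


# Toy 5x9 image (hypothetical data) holding three shapes split by blank columns.
toy_image = [
    [1, 1, 0, 0, 1, 0, 0, 1, 1],
    [1, 0, 0, 0, 1, 0, 0, 1, 0],
    [1, 1, 0, 0, 1, 0, 0, 1, 1],
    [1, 0, 0, 0, 1, 0, 0, 0, 1],
    [1, 1, 0, 0, 1, 0, 0, 1, 1],
]
letter_ranges = split_letters(toy_image)
print(letter_ranges)
```

Each range can then be cropped out and handed to the classifier that recognizes the individual letter.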

Of course it's complex, but the steps are there ...
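The last two steps of the recipe (dictionary lookup, then putting the translation where the original text was) can be sketched like this. It is only an illustration under simplifying assumptions: positions are flat 2D boxes rather than x, y, z, the `TextBox` type, the `translate_boxes` function and the tiny dictionary are hypothetical, and the drawing itself is replaced by a `print`.

```python
from dataclasses import dataclass, replace


@dataclass
class TextBox:
    """A piece of recognized text and where it was found in the frame."""
    text: str
    x: int
    y: int
    width: int
    height: int


# Hypothetical toy dictionary (Portuguese to English), for illustration only.
TOY_DICTIONARY = {"saída": "exit", "entrada": "entrance"}


def translate_boxes(boxes, dictionary):
    """Return new boxes with translated text and unchanged coordinates."""
    translated = []
    for box in boxes:
        # Words missing from the dictionary are kept as they are.
        new_text = dictionary.get(box.text.lower(), box.text)
        translated.append(replace(box, text=new_text))
    return translated


# Boxes as an OCR step might have produced them (made-up values).
detected = [TextBox("Saída", 40, 120, 90, 30), TextBox("Metro", 40, 160, 95, 30)]
overlays = translate_boxes(detected, TOY_DICTIONARY)
for box in overlays:
    # A real app would paint over the original region and draw the new text here.
    print(f"draw {box.text!r} at ({box.x}, {box.y}) size {box.width}x{box.height}")
```

Keeping the coordinates attached to each piece of text is what makes the final overlay possible: the translation changes only the `text` field.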

    
07.09.2018 / 03:50
0

To complement ederwander's answer: I think what you're looking for is related to OCR, and a widely used tool for that is Tesseract.

Training will not always be a problem, because good trained data is already available online for en, among other languages. As for the programming language, don't worry: there are wrappers for several of them, such as C# and Python.
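As a rough illustration of using Tesseract from Python, the sketch below calls the `tesseract` command-line program directly instead of a wrapper library. It assumes Tesseract and the trained data for the requested language are installed; the helper names and the file name `sign.png` are made up for the example.

```python
import shutil
import subprocess


def build_tesseract_command(image_path, lang="eng"):
    """Build the command line; "stdout" makes tesseract print the text."""
    return ["tesseract", image_path, "stdout", "-l", lang]


def image_to_text(image_path, lang="eng"):
    """Run Tesseract on an image file and return the recognized text."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("the tesseract executable was not found on PATH")
    result = subprocess.run(
        build_tesseract_command(image_path, lang),
        capture_output=True,
        text=True,
        check=True,  # raise if tesseract exits with an error
    )
    return result.stdout.strip()


# "por" is the Tesseract language code for Portuguese; sign.png is hypothetical.
command = build_tesseract_command("sign.png", lang="por")
print(" ".join(command))
```

With an actual image on disk, `image_to_text("sign.png", lang="por")` would return the recognized text, which could then be passed to the translation step.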

    
07.09.2018 / 04:50