How to track letters and cut them?

12

I have this image below. As you can see, they are together. In this case, it is a "7", "h" and "a". Is it possible to crawl the image to find them and then cut them where exactly you found them?

I'm using the OpenCV, OCR, and Tesseract library in Java. I've already looked for something like findcontours, convexHull ... but I did not get the result I want.

    
asked by anonymous 07.08.2017 / 10:49

1 answer

1

This problem is typical of an OCR. Known as character targeting ( char segmentation algorithms , if you search on google), this problem has been studied since the 1950s.

In general, the algorithm that both tesseract and ocropus use (the 2 most well-known OCRs opensources) does the following:

  • Find the connected components - Scan the image and find all pixels that have neighbors. So characters that do not stick together can be easily separated.

  • For each connected component, calculate a box that fits all component pixels (bounding box).

  • Only, as you mentioned, this algorithm fails in 2 cases. When the characters are stuck and when a character breaks in 2.

    To solve this problem, ocrs use a technique called Seam Carving . In general terms, it is an algorithm that calculates a path that spends less "energy", or goes through fewer obstacles. In the case of character segmentation, a path with fewer obstacles is a vertical path that least touches the character's boundary.

    For example, let's say we have the following image:

    Thepathswiththefewestobstaclesarethefollowing:

    Inthisimage,thepossiblecutsareblackinthesentenceandatthebottomagraphshowingthenumberofobstaclescalculatedbythealgorithmforeachcolumnofpixels.

    Theresultofthisalgorithmisnotthecutsitself,butthecuttingpossibilities.Somearerightandsomearenot.ThenextstepinOCRistoresolvethisproblem.

    Inthisstep,OCRgeneratesseveralclippingpossibilities,joiningorseparatingthesegmentsfound,asintheimagebelow:

    ThisgeneratesdozensofpossibilitiesforOCR.ButhowdoesOCRknowthecorrectclipping?Simple,itthrowseverypossibilityintoacharacterrecognizer,whichwillreturnadegreeofconfidencethatthatcutisaletter.

    TheOCRthentakesallthecutoutsinwhichthesumofthedegreesofconfidenceisaslargeaspossible.

    PS:I'mnotgoingtodetaileveryalgorithmhere,becauseitwouldgiveyoua5,000wordpost,butyoucanstudyallthosealgorithmsimplementedin OCRopus .

        
    10.07.2018 / 00:39