Beyond OCR - Using AI to Understand Complex Technical Drawings

The machine-building industry has long sought technology to automate data extraction from Technical Drawings. Until now, the only option has been OCR (Optical Character Recognition). You may have already tried OCR solutions such as Google Vision or Amazon Textract, only to realize:

 

Generic OCR is not enough to understand Technical Drawings.

An OCR-only solution has numerous limitations when it comes to understanding something as complex as a Technical Drawing. Let’s take a deeper look at how Werk24’s AI algorithms surpass generic OCR on several challenges and achieve fully automatic data extraction from Technical Drawings.

 

Structuring Text Elements

The biggest challenge for a machine reading Technical Drawings is to understand the meaning of individual text elements and to know when and how to group them into a structured data format. OCR can only read out the text; it cannot understand the meaning of its own result.

Technical Drawings contain many complex data formats, such as Measures, GD&T and the information in Title Blocks. A Measure is often presented as a Nominal Size with the Upper and Lower Deviation stacked on top of each other. OCR extracts text from left to right and cannot distinguish which text is the Nominal Size, the Upper Deviation or the Lower Deviation. Because of the complex visual surroundings, OCR also makes numerous mistakes when grouping corresponding elements.

Werk24 developed advanced Machine Learning models and AI algorithms to understand all common formats of Measures, including Nominal Sizes, Tolerances, Fit Sizes and Threads. By understanding the meaning of each element based on its content, context and visual grouping, Werk24’s API groups the right elements into structured data and returns them as JSON that can be fed directly into your software system.
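To make the grouping problem concrete, here is a minimal sketch of how three text tokens could be combined into one structured Measure. It is a toy heuristic, not Werk24’s method: the Token and Measure classes, the field names and the "tallest font is the nominal size" rule are all assumptions made for illustration.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Token:
    """One text fragment as a generic OCR engine might return it (hypothetical shape)."""
    text: str
    x: float       # left edge in pixels
    y: float       # top edge in pixels (image coordinates, y grows downward)
    height: float  # glyph height in pixels


@dataclass
class Measure:
    """Hypothetical structured result; not the actual Werk24 JSON schema."""
    nominal_size: float
    upper_deviation: float
    lower_deviation: float


def group_measure(tokens):
    """Group one nominal size and two stacked deviations into a Measure."""
    if len(tokens) != 3:
        raise ValueError("expected exactly three tokens")
    # Assumption: the nominal size is printed in the tallest font.
    nominal = max(tokens, key=lambda t: t.height)
    deviations = [t for t in tokens if t is not nominal]
    # Smaller y means higher on the page, so the upper deviation sorts first.
    upper, lower = sorted(deviations, key=lambda t: t.y)
    return Measure(float(nominal.text), float(upper.text), float(lower.text))


tokens = [
    Token("+0.2", x=140, y=95, height=8),
    Token("25", x=100, y=100, height=16),
    Token("-0.1", x=140, y=108, height=8),
]
measure = group_measure(tokens)
print(json.dumps(asdict(measure)))
```

A plain left-to-right OCR pass would return these three strings in an arbitrary order. The layout information (position and size) is what makes it possible to assign each one a role.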

Another example is the Title Block, where captions (the small text describing what the content is about) such as “Designation”, “Drawing ID” or “Company” are commonly missing. This makes OCR results useless, because the computer cannot tell whether a piece of text is the Designation, the Drawing ID or the company details. Werk24 uses AI and ML to understand each text element and pair it with the missing caption, so that your RFQ or ERP system can use the information directly.
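The idea of pairing uncaptioned text with a caption can be illustrated with a deliberately simple rule-based stand-in. The patterns, caption names and sample strings below are hypothetical; a production system would rely on learned models and layout context rather than three regular expressions.

```python
import re

# Hypothetical rules, checked in order; the first match wins.
CAPTION_RULES = [
    ("drawing_id", re.compile(r"^[A-Z]{1,4}-?\d{3,}(-\d+)?$")),
    ("company", re.compile(r"\b(GmbH|AG|Inc\.?|Ltd\.?)$")),
    ("designation", re.compile(r"^[A-Za-z][A-Za-z ]+$")),
]


def infer_caption(text):
    """Return the caption that best describes an uncaptioned title block entry."""
    cleaned = text.strip()
    for caption, pattern in CAPTION_RULES:
        if pattern.search(cleaned):
            return caption
    return None  # no rule applies; leave the entry unlabeled


raw_entries = ["DR-10482", "Example Machines GmbH", "Bearing Housing"]
title_block = {infer_caption(entry): entry for entry in raw_entries}
print(title_block)
```

The output is a keyed record instead of a flat list of strings, which is the form an RFQ or ERP system needs.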

Technical Drawing Title Block Comparison Between Google Vision OCR and Werk24 JSON
 

Context-Aware Correction

OCR often fails to differentiate between numbers and characters that look alike, such as “1”, “7” and “I”, “0” and “O”, or “6” and “8”. This makes OCR an unreliable option for processing Technical Drawings in practice.

Werk24’s technology understands the meaning and context of each text element. Furthermore, it cross-checks Measure Labels against Measure Lines. This means it knows that a Nominal Size should be “11” rather than “17” when the printed characters look ambiguous.
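One way such a cross-check could work is sketched below: when the recognizer is torn between several readings of a label, the length of the associated measure line is used to pick the most plausible one. This is an illustrative assumption, not a description of Werk24’s implementation; the function name, the pixel scale and the premise that the view is drawn to a known scale are all hypothetical.

```python
def pick_reading(candidates, line_length_px, px_per_mm):
    """Choose the candidate label closest to the length of the measure line.

    candidates     -- alternative readings of the same label, e.g. ["11", "17"]
    line_length_px -- length of the measure line in pixels
    px_per_mm      -- assumed drawing scale (pixels per millimetre)
    """
    measured_mm = line_length_px / px_per_mm
    return min(candidates, key=lambda c: abs(float(c) - measured_mm))


# The line is 112 px long at 10 px/mm, i.e. about 11.2 mm,
# so "11" is far more plausible than "17".
best = pick_reading(["11", "17"], line_length_px=112.0, px_per_mm=10.0)
print(best)
```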

 

Understanding Special Symbols

Generic OCR solutions cannot read Special Symbols, including all GD&T symbols. Even for mathematical symbols such as “Ø” and “±”, generic OCR produces unreliable results depending on the font.

With its own trained Machine Learning model, Werk24 understands all special symbols in Measures and Tolerances.
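Why the symbols matter becomes clear once a label has to be turned into data. The sketch below parses a simple label such as “Ø12 ±0.1” into a structured record. The pattern and the field names are hypothetical, and it covers only this one label shape. If “Ø” is misread as the letter “O”, the label no longer parses at all.

```python
import re

# Hypothetical pattern: optional diameter sign, nominal value, optional ± tolerance.
MEASURE_PATTERN = re.compile(
    r"^(?P<diameter>Ø)?\s*(?P<nominal>\d+(?:\.\d+)?)"
    r"(?:\s*±\s*(?P<tolerance>\d+(?:\.\d+)?))?$"
)


def parse_measure_label(label):
    """Parse a label like 'Ø12 ±0.1'; return None if it does not fit the pattern."""
    match = MEASURE_PATTERN.match(label.strip())
    if match is None:
        return None
    tolerance = match.group("tolerance")
    return {
        "is_diameter": match.group("diameter") is not None,
        "nominal": float(match.group("nominal")),
        "tolerance": float(tolerance) if tolerance is not None else None,
    }


print(parse_measure_label("Ø12 ±0.1"))
print(parse_measure_label("O12 ±0.1"))  # misread symbol: returns None
```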

 

Complex Graphic Surrounding

Generic OCR cannot reliably detect text in Drawings when it is surrounded by cluttered, intersecting graphic elements such as lines, symbols and annotations.

Werk24’s TechRead API reads text elements despite the noise that surrounds them. Thus, even when rotation lines intersect and interfere with Measures, small text fragments can still be read with high accuracy.

 

Multiple Orientation

Many major OCR solutions require the document to have a dominant orientation. Text in an article, for example, always points in one direction, whereas Technical Drawings often contain text elements in different orientations. As a result, OCR solutions such as Amazon Textract miss many text elements.

Werk24 does not assume a dominant orientation, which is of great benefit when extracting data. Instead, the technology reads Measures from each text element individually, whether it is horizontal, vertical, or tilted at an angle.
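Treating each text element individually means each one carries its own reading direction. The sketch below assumes a detector has already supplied the start and end point of every text baseline (a hypothetical input format) and computes a per-element angle, rather than one angle for the whole page.

```python
import math


def baseline_angle_deg(x0, y0, x1, y1):
    """Counter-clockwise angle of a text baseline in degrees, in the range [0, 360).

    Points are in image coordinates (y grows downward), so the y difference
    is negated to get the usual mathematical orientation.
    """
    return math.degrees(math.atan2(-(y1 - y0), x1 - x0)) % 360


# Hypothetical detections: (label, baseline start x, start y, end x, end y)
detections = [
    ("25", 100, 200, 140, 200),   # horizontal
    ("Ø12", 300, 260, 300, 220),  # vertical, read bottom to top
    ("R5", 50, 80, 70, 60),       # tilted
]
angles = {label: baseline_angle_deg(*points) for label, *points in detections}
print(angles)
```

Each element can then be rotated upright on its own before its characters are recognized, so no dominant page orientation is required.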


Werk24’s Complete Solution

The market has long searched for a sophisticated and reliable technical solution to extract data from Technical Drawings, and Werk24 meets this need with its TechRead API. Available now, it provides the means to automatically obtain important data from Technical Drawings, including Measures, Tolerances, GD&T and Title Blocks, so that customers are no longer held back by inadequate OCR solutions. All important production data in Technical Drawings is accessible in JSON format within several seconds.
