.NET OCR SDK

OCR Languages & Trained Data
CnetSDK .NET OCR SDK is an accurate and mature Tesseract OCR software that analyzes and recognizes text languages and quickly extracts text from your source image. This .NET OCR library supports more than 60 languages, including English, CJK (Chinese, Japanese and Korean), Spanish, German, French, Italian, Russian, Latin, Hindi, Greek, Turkish, Dutch, Portuguese, Thai, and more.
.NET TESSERACT OCR SOFTWARE LANGUAGE SETTING
The first section of this page shows a C# example of using CnetSDK .NET OCR SDK. It illustrates how to use our .NET Tesseract OCR software and set multiple OCR languages, which you will need when the image text contains more than one language. Please note that the trained data for all supported OCR languages can be downloaded from the second part of this guide page.

Things You Should Know

1. If you are using the CnetSDK .NET OCR SDK free trial package, the first character of the text extracted from an image will be recognized as "CnetSDK*". To get the full Tesseract OCR software for your .NET application development, please order a license.

2. We provide .NET OCR library DLLs for both x86 and x64 platforms. Please use the ones that match your platform. On an x64 platform, copy "CnetSDKOCR_Lept.dll" and "CnetSDKOCR_Tesseract.dll" from the x64 folder to the same folder as "CnetSDK.OCR.Trial.dll". On an x86 platform, do the same with the files from the x86 folder.
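The platform check in item 2 can also be done in code at startup. The following is only a minimal sketch and not part of the SDK: the class NativeOcrLibraries and its method names are hypothetical, and it assumes the package keeps the native DLLs in folders named "x86" and "x64" as described above.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper (not part of CnetSDK): reports which native OCR DLLs
// are missing from the application folder.
public static class NativeOcrLibraries
{
    // File names taken from item 2 above.
    private static readonly string[] NativeDlls =
    {
        "CnetSDKOCR_Lept.dll",
        "CnetSDKOCR_Tesseract.dll"
    };

    // Returns "x64" for a 64-bit process and "x86" otherwise
    // (folder names are an assumption based on item 2).
    public static string PlatformFolder(bool is64BitProcess)
    {
        return is64BitProcess ? "x64" : "x86";
    }

    // Lists the native DLLs that are not present in the given directory.
    public static List<string> FindMissing(string appDirectory)
    {
        var missing = new List<string>();
        foreach (string dll in NativeDlls)
        {
            if (!File.Exists(Path.Combine(appDirectory, dll)))
            {
                missing.Add(dll);
            }
        }
        return missing;
    }
}
```

A typical call would be NativeOcrLibraries.FindMissing(AppDomain.CurrentDomain.BaseDirectory), combined with NativeOcrLibraries.PlatformFolder(Environment.Is64BitProcess) to tell the user which folder to copy the missing files from.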
using System;
using CnetSDK.OCR.Trial;

class Program
{
    static void Main()
    {
        // Create an OCR engine instance.
        OcrEngine ocrEngine = new OcrEngine();

        // Set the absolute path of the tessdata folder that holds the trained data.
        ocrEngine.TessDataPath = "F:/tessdata/";

        // Set the target text languages; multiple program codes are joined with "+".
        ocrEngine.TextLanguage = "eng+fra";

        // Read and recognize text from a PNG image.
        string imageText = ocrEngine.PerformOCR("F:/CnetSDK.png");

        Console.WriteLine(imageText);
    }
}
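Before running OCR, it can help to confirm that the trained data for every requested language is actually in the tessdata folder. The sketch below is an illustration only: the class TrainedDataCheck is hypothetical, and it assumes the trained data files follow the standard Tesseract naming "<code>.traineddata" (for example "eng.traineddata"), which this page does not state explicitly.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper (not part of CnetSDK): checks trained data files,
// assuming the standard Tesseract "<code>.traineddata" file naming.
public static class TrainedDataCheck
{
    // Splits a language setting such as "eng+fra" into its program codes.
    public static string[] SplitLanguages(string textLanguage)
    {
        return textLanguage.Split(new[] { '+' }, StringSplitOptions.RemoveEmptyEntries);
    }

    // Returns the program codes whose trained data file is absent from tessDataPath.
    public static List<string> MissingLanguages(string tessDataPath, string textLanguage)
    {
        var missing = new List<string>();
        foreach (string code in SplitLanguages(textLanguage))
        {
            string file = Path.Combine(tessDataPath, code + ".traineddata");
            if (!File.Exists(file))
            {
                missing.Add(code);
            }
        }
        return missing;
    }
}
```

For the example above, TrainedDataCheck.MissingLanguages("F:/tessdata/", "eng+fra") would return an empty list when both files are present, so the application can report a missing language clearly instead of failing during recognition.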
.NET TESSERACT OCR SOFTWARE TRAINED DATA
The C# example above uses only English and French. In total, CnetSDK .NET OCR software supports more than 60 OCR languages, such as Spanish, German, Italian, Russian, Latin, Hindi, Greek, Turkish, Dutch, Portuguese, Thai, CJK (Chinese, Japanese and Korean), and so on. Please download the trained data of your target language(s) from the following part.
​​
    Language                        Program Code    Download
    Afrikaans                       afr
    Albanian                        sqi
    Arabic                          ara
    Azerbaijani                     aze
    Basque                          eus
    Belarusian                      bel
    Bengali                         ben
    Bulgarian                       bul
    Catalan; Valencian              cat
    Cherokee                        chr
    Chinese - Simplified            chi_sim
    Chinese - Traditional           chi_tra
    Croatian                        hrv
    Czech                           ces
    Danish                          dan
    Dutch; Flemish                  nld
    English                         eng
    English, Middle (1100-1500)     enm
    Esperanto                       epo
    Estonian                        est
    Finnish                         fin
    Frankish                        frk
    French                          fra
    French, Middle (ca. 1400-1600)  frm
    Galician                        glg
    German                          deu
    Greek, Ancient (-1453)          grc
    Greek, Modern (1453-)           ell
    Hebrew                          heb
    Hindi                           hin
    Hungarian                       hun
    Icelandic                       isl
    Indonesian                      ind
    Italian                         ita
    Italian - Old                   ita_old
    Japanese                        jpn
    Kannada                         kan
    Korean                          kor
    Latvian                         lav
    Lithuanian                      lit
    Macedonian                      mkd
    Malay                           msa
    Malayalam                       mal
    Maltese                         mlt
    Norwegian                       nor
    Polish                          pol
    Portuguese                      por
    Romanian; Moldavian; Moldovan   ron
    Russian                         rus
    Serbian                         srp
    Slovak                          slk
    Slovenian                       slv
    Spanish; Castilian              spa
    Spanish; Castilian - Old        spa_old
    Swahili                         swa
    Swedish                         swe
    Tagalog                         tgl
    Tamil                           tam
    Telugu                          tel
    Thai                            tha
    Turkish                         tur
    Ukrainian                       ukr
    Vietnamese                      vie
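The language names and program codes listed above can be kept in a small lookup so that application code works with readable names and still produces a setting such as "eng+fra". This is a minimal sketch under stated assumptions: the class OcrLanguageCodes and the method BuildTextLanguage are hypothetical and not part of the SDK, and the dictionary holds only a few entries from the list as an excerpt.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not part of CnetSDK): maps language names to
// program codes and builds a "+"-separated language setting.
public static class OcrLanguageCodes
{
    // A short excerpt of the language list above; extend it as needed.
    private static readonly Dictionary<string, string> Codes =
        new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
        {
            { "English", "eng" },
            { "French", "fra" },
            { "German", "deu" },
            { "Chinese - Simplified", "chi_sim" },
            { "Japanese", "jpn" }
        };

    // Builds a language setting from names, in the order they are given.
    public static string BuildTextLanguage(params string[] languageNames)
    {
        var codes = new List<string>();
        foreach (string name in languageNames)
        {
            string code;
            if (!Codes.TryGetValue(name, out code))
            {
                throw new ArgumentException("Unknown OCR language: " + name);
            }
            codes.Add(code);
        }
        return string.Join("+", codes);
    }
}
```

The result of OcrLanguageCodes.BuildTextLanguage("English", "French") can then be assigned to the TextLanguage property shown in the first section. Throwing on an unknown name is a design choice of this sketch, so that a typo is caught before OCR starts.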
CnetSDK .NET OCR SDK is a mature Tesseract OCR software. It not only enables C# and VB.NET developers to quickly extract text from single-page raster images and multi-page TIFF files, but also provides advanced zonal OCR technology for recognizing text in a specific image area or field. You may refer to the following online tutorials for details.