Training data is created using [url=https://github.com/tesseract-ocr/tesseract/blob/master/src/training/tesstrain.sh]tesstrain.sh[/url] as follows:
src/training/tesstrain.sh --fonts_dir /usr/share/fonts --lang eng --linedata_only \
  --noextract_font_properties --langdata_dir ../langdata \
  --tessdata_dir ./tessdata --output_dir ~/tesstutorial/engtrain
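2. Training files now use the .lstmf format in place of the original .tr files.
Use tesstrain.sh to generate a large number of .lstmf files, which serve as the raw training data.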
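
After tesstrain.sh finishes, it can be useful to confirm that .lstmf files were actually produced. The following is a minimal sketch, not part of the Tesseract tooling: the helper name count_lstmf is hypothetical, and the default directory assumes the --output_dir value from the command above.

```shell
#!/bin/sh
# Hypothetical helper (not part of Tesseract): count the .lstmf files
# found anywhere under the directory given as the first argument.
count_lstmf() {
  find "$1" -type f -name '*.lstmf' | wc -l | tr -d ' '
}

# Assumption: the output directory matches --output_dir in the command above.
out_dir="${1:-$HOME/tesstutorial/engtrain}"

if [ -d "$out_dir" ]; then
  echo "Found $(count_lstmf "$out_dir") .lstmf file(s) in $out_dir"
else
  echo "Directory not found: $out_dir"
fi
```

A count of zero suggests the run did not complete, or that --output_dir pointed somewhere else.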