Edited 2021-10-22 15:32 · Beijing University of Technology · Algorithm Engineer


Training your own text recognition model with PaddleOCR (AI reading of seven-segment digital displays)

Recommended PaddlePaddle version: 2.0.2

Useful blog posts:

[Figures: training data layout] If the training data layout above does not work, use the following setup:

Configuration file: [figure]

Contents of train.txt: [figure]
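As a minimal sketch of what such a label file could contain, assuming the tab-separated `image_path<TAB>label` line format that SimpleDataSet reads, with image paths given relative to `data_dir`. The sample file names and readings are hypothetical, not taken from the original screenshots.

```python
from pathlib import Path


def format_label_lines(samples):
    """Turn (image_path, label) pairs into 'image_path<TAB>label' lines."""
    lines = []
    for image_path, label in samples:
        # A tab or newline inside either field would corrupt the line format.
        if any(ch in image_path + label for ch in "\t\n"):
            raise ValueError(f"tab/newline not allowed: {image_path!r}, {label!r}")
        lines.append(f"{image_path}\t{label}")
    return lines


def write_label_file(samples, out_path):
    """Write the label lines as UTF-8, one sample per line."""
    text = "\n".join(format_label_lines(samples)) + "\n"
    Path(out_path).write_text(text, encoding="utf-8")


# Hypothetical digital-display crops and their readings.
samples = [("train/0001.jpg", "12.5"), ("train/0002.jpg", "0.83")]
for line in format_label_lines(samples):
    print(line)
```

Calling `write_label_file(samples, "train_data/train_list.txt")` would then produce a file that matches the `label_file_list` entry used in the config.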


Global:
  ...
  # To use a custom dictionary, point this path to the new dictionary file
  character_dict_path: ppocr/utils/ppocr_keys_v1.txt
  # Modify character type
  character_type: ch
  ...
  # Whether to recognize spaces
  use_space_char: True


Optimizer:
  ...
  # Add learning rate decay strategy
  lr:
    name: Cosine
    learning_rate: 0.001
  ...

...

Train:
  dataset:
    # Type of dataset; LMDBDataSet and SimpleDataSet are supported
    name: SimpleDataSet
    # Path of dataset
    data_dir: ./train_data/
    # Path of train list
    label_file_list: ["./train_data/train_list.txt"]
    transforms:
      ...
      - RecResizeImg:
          # Modify image_shape to fit long text
          image_shape: [3, 32, 320]
      ...
  loader:
    ...
    # Train batch_size for Single card
    batch_size_per_card: 256
    ...

Eval:
  dataset:
    # Type of dataset; LMDBDataSet and SimpleDataSet are supported
    name: SimpleDataSet
    # Path of dataset
    data_dir: ./train_data
    # Path of eval list
    label_file_list: ["./train_data/val_list.txt"]
    transforms:
      ...
      - RecResizeImg:
          # Modify image_shape to fit long text
          image_shape: [3, 32, 320]
      ...
  loader:
    # Eval batch_size for Single card
    batch_size_per_card: 256
    ...
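To get a feel for what `image_shape: [3, 32, 320]` means for a given crop, here is a rough sketch of the width calculation. It assumes that RecResizeImg scales the image to the target height while keeping the aspect ratio, caps the width at the target width, and pads the remainder; that behavior is my understanding of the library and is not confirmed in this post. The function name `resized_width` is hypothetical.

```python
import math


def resized_width(height, width, image_shape=(3, 32, 320)):
    """Width of the image after scaling to the target height (assumed behavior)."""
    _, target_h, target_w = image_shape
    ratio = width / float(height)
    scaled_w = math.ceil(target_h * ratio)
    # Wider crops are squeezed to the target width; narrower ones are padded.
    return min(scaled_w, target_w)


for h, w in [(48, 240), (32, 1000), (30, 100)]:
    print((h, w), "->", resized_width(h, w))
```

If most of your crops hit the cap, long readings get squeezed horizontally, which is when increasing the last value of `image_shape` is worth trying.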

Note: the config file used for prediction/evaluation must be the same as the one used for training.
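For a digital-display reader, the file behind `character_dict_path` only needs the characters that actually occur in the labels. The sketch below collects them, assuming the dictionary format is one character per line in UTF-8; the helper name `build_char_dict` and the sample labels are hypothetical.

```python
from pathlib import Path


def build_char_dict(labels):
    """Return the sorted set of characters used in the labels."""
    chars = set()
    for label in labels:
        chars.update(label)
    # Spaces are left to the use_space_char option rather than the dictionary.
    chars.discard(" ")
    return sorted(chars)


def write_char_dict(chars, out_path):
    """Write one character per line."""
    Path(out_path).write_text("\n".join(chars) + "\n", encoding="utf-8")


# Hypothetical readings taken from a label file.
labels = ["12.5", "0.83", "-7"]
print(build_char_dict(labels))
```

The same dictionary file has to be referenced at training, evaluation, and prediction time, otherwise the character indices no longer line up.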

3 Evaluation

The evaluation dataset can be set by modifying the Eval.dataset.label_file_list field in configs/rec/rec_icdar15_train.yml.

python3 -m paddle.distributed.launch --gpus '0' tools/eval.py -c configs/rec/rec_icdar15_train.yml -o Global.checkpoints={path/to/weights}/best_accuracy

4 Prediction

4.1 Prediction with the training engine

With a model trained by PaddleOCR, you can quickly get predictions using the following script.

The image to predict is given by infer_img, and the weights are specified with the -o option (Global.pretrained_model in the command below):

python3 tools/infer_rec.py -c configs/rec/ch_ppocr_v2.0/rec_chinese_lite_train_v2.0.yml -o Global.pretrained_model={path/to/weights}/best_accuracy Global.load_static_weights=false Global.infer_img=doc/imgs_words/en/word_1.png

Prediction result for the input image:

infer_img: doc/imgs_words/en/word_1.png
        result: ('joint', 0.9998967)
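When the recognized text has to be used further, for example as a meter reading, the result line can be parsed. This is a minimal sketch that assumes the line looks exactly like the sample output above, a `result:` prefix followed by a Python-style `(text, score)` tuple; the log format may differ between PaddleOCR versions, and the helper name and the 0.9 threshold are hypothetical.

```python
import ast


def parse_result_line(line):
    """Extract (text, score) from a line such as "result: ('joint', 0.9998967)"."""
    _, sep, payload = line.strip().partition("result:")
    if not sep:
        raise ValueError(f"no 'result:' prefix in {line!r}")
    # literal_eval only accepts Python literals, so arbitrary code is not run.
    text, score = ast.literal_eval(payload.strip())
    return text, float(score)


text, score = parse_result_line("        result: ('joint', 0.9998967)")
# Hypothetical threshold: treat low-confidence readings as unreliable.
print(text, score, "accepted" if score >= 0.9 else "rejected")
```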

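The config file used for prediction must be the same as the one used for training. For example, if you trained a Chinese model with python3 tools/train.py -c configs/rec/ch_ppocr_v2.0/rec_chinese_lite_train_v2.0.yml, you can use the following command to run prediction with it: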

python3 tools/infer_rec.py -c configs/rec/ch_ppocr_v2.0/rec_chinese_lite_train_v2.0.yml -o Global.pretrained_model={path/to/weights}/best_accuracy Global.load_static_weights=false Global.infer_img=doc/imgs_words/ch/word_1.jpg
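To avoid mixing up configs when predicting many images, the command can be assembled in one place so that every call uses the same config and weights. A small sketch, assuming only the options that appear in the commands above; the function name, the weights path, and the image names in the example are placeholders.

```python
import shlex


def build_infer_command(config, weights, image):
    """Assemble the tools/infer_rec.py call with -o key=value overrides."""
    overrides = {
        "Global.pretrained_model": weights,
        "Global.load_static_weights": "false",
        "Global.infer_img": image,
    }
    cmd = ["python3", "tools/infer_rec.py", "-c", config, "-o"]
    cmd += [f"{key}={value}" for key, value in overrides.items()]
    return cmd


# Use the same config file that was passed to tools/train.py.
config = "configs/rec/ch_ppocr_v2.0/rec_chinese_lite_train_v2.0.yml"
for image in ["img_a.jpg", "img_b.jpg"]:  # placeholder image names
    print(shlex.join(build_infer_command(config, "output/best_accuracy", image)))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from the PaddleOCR repository root.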

