Text Classification with a Pre-trained BERT Model

Author: 张亚飞 | Reading time: 2 minutes | Published: 2020-01-09
Categories: Linux | Tags: Note

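The BERT Model

BERT stands for Bidirectional Encoder Representations from Transformers. The model is obtained by pre-training on two tasks: masked language modeling (Masked Language Model) and next-sentence prediction.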
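To make the masked language modeling objective concrete, here is a minimal sketch of how input tokens can be corrupted before training. The proportions (15% of positions selected; of those, 80% replaced by [MASK], 10% by a random token, 10% left unchanged) follow the BERT paper. The function name mask_tokens and the toy vocabulary are assumptions made for illustration; this is not the code used in the google-research/bert repository.

```python
import random

MASK = "[MASK]"

def mask_tokens(tokens, vocab, mask_prob=0.15, seed=None):
    """Return (masked_tokens, labels) for a simplified masked-LM example.

    labels[i] holds the original token at each selected position and
    None everywhere else, so the loss is computed only on selected positions.
    """
    rng = random.Random(seed)
    masked = list(tokens)
    labels = [None] * len(tokens)
    for i, tok in enumerate(tokens):
        if rng.random() >= mask_prob:
            continue  # position not selected for prediction
        labels[i] = tok
        roll = rng.random()
        if roll < 0.8:
            masked[i] = MASK               # 80%: replace with [MASK]
        elif roll < 0.9:
            masked[i] = rng.choice(vocab)  # 10%: replace with a random token
        # remaining 10%: keep the original token unchanged
    return masked, labels

# Toy vocabulary and sentence, for illustration only.
vocab = ["the", "cat", "sat", "on", "mat", "dog"]
tokens = ["the", "cat", "sat", "on", "the", "mat"]
masked, labels = mask_tokens(tokens, vocab, seed=0)
print(masked)
print(labels)
```

The model is then trained to recover the original token at every position where labels is not None.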

Preparation

1. Download BERT

git clone https://github.com/google-research/bert.git

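2. Download a pre-trained BERT model

Google provides a variety of pre-trained BERT models covering different languages and model sizes. For Chinese, we use BERT-Base, Chinese. Downloading this model may require a proxy or VPN. If you need other models (English or other languages), the download links are listed under "Pre-trained models" in the BERT repository.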
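After extracting the downloaded archive, it can help to confirm that the files the BERT scripts read are actually present. The sketch below assumes the archive was unpacked into a directory named chinese_L-12_H-768_A-12 in the current working directory; adjust the path if yours differs. The helper missing_model_files is a hypothetical name, not part of the BERT repository.

```python
import os

# Files that the BERT-Base, Chinese archive is expected to contain.
EXPECTED_FILES = [
    "bert_config.json",       # model hyperparameters
    "vocab.txt",              # WordPiece vocabulary
    "bert_model.ckpt.index",  # TensorFlow checkpoint index
]

def missing_model_files(model_dir):
    """Return the expected files that are not found in model_dir."""
    return [
        name for name in EXPECTED_FILES
        if not os.path.isfile(os.path.join(model_dir, name))
    ]

# Assumed extraction directory; change this to your own path.
model_dir = "chinese_L-12_H-768_A-12"
missing = missing_model_files(model_dir)
print("missing files:", missing if missing else "none")
```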


Training the Model

processed 192251 tokens with 6932 phrases; found: 7319 phrases; correct: 6450.
accuracy:  99.12%; precision:  88.13%; recall:  93.05%; FB1:  90.52
              LOC: precision:  90.87%; recall:  94.04%; FB1:  92.43  3318
              ORG: precision:  78.46%; recall:  88.56%; FB1:  83.20  2298
              PER: precision:  95.83%; recall:  96.57%; FB1:  96.20  1703
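The overall precision, recall, and F1 (FB1) in the report above follow directly from the three phrase counts on its first line. As a sketch, with prf being a hypothetical helper name:

```python
def prf(total, found, correct):
    """Compute precision, recall, and F1 from phrase counts.

    total   -- number of gold phrases in the data
    found   -- number of phrases the model predicted
    correct -- number of predicted phrases that match a gold phrase
    """
    precision = correct / found if found else 0.0
    recall = correct / total if total else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Counts taken from the evaluation output above.
p, r, f1 = prf(total=6932, found=7319, correct=6450)
print(f"precision: {p:.2%}; recall: {r:.2%}; FB1: {f1 * 100:.2f}")
# prints: precision: 88.13%; recall: 93.05%; FB1: 90.52
```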

Validating the Results

going to restore checkpoint
{1: 'I-ORG', 2: 'B-LOC', 3: 'I-PER', 4: 'I-LOC', 5: '[CLS]', 6: 'X', 7: 'O', 8: '[SEP]', 9: 'B-ORG', 10: 'B-PER'}
请输入需要预测的句子:
武汉新型冠状病毒的起源
[['B-LOC', 'I-LOC', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O']]
LOC, 武汉
PER
ORG
time used: 1.255111 sec
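The step from the per-character tag list to the final "LOC, 武汉" line can be sketched as a simple BIO decoder. This is an illustrative reimplementation under the assumption that each tag corresponds to one character of the input; extract_entities is a hypothetical name and not the prediction script's actual code.

```python
def extract_entities(chars, tags):
    """Group B-/I- tagged characters into (entity_type, text) pairs.

    An I- tag that does not continue an open entity of the same type
    is ignored, as are O tags.
    """
    entities = []
    current_type, current_chars = None, []

    def flush():
        if current_type is not None:
            entities.append((current_type, "".join(current_chars)))

    for ch, tag in zip(chars, tags):
        if tag.startswith("B-"):
            flush()                      # close any open entity
            current_type, current_chars = tag[2:], [ch]
        elif tag.startswith("I-") and current_type == tag[2:]:
            current_chars.append(ch)     # continue the open entity
        else:
            flush()
            current_type, current_chars = None, []
    flush()
    return entities

sentence = "武汉新型冠状病毒的起源"
tags = ["B-LOC", "I-LOC", "O", "O", "O", "O", "O", "O", "O", "O", "O"]
print(extract_entities(sentence, tags))
# prints: [('LOC', '武汉')]
```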

NVIDIA Driver

[How To] Install Latest NVIDIA Drivers In Linux

docker container run --rm --name=us.tensorflow -u root -p 8806:8806 --gpus all -v /home/core/data/www/Work:/opt/data -it coam/us.tensorflow /bin/bash --login

Starting the container with the --gpus all flag fails with the following error:

/usr/bin/docker: Error response from daemon: exec: "nvidia-container-runtime-hook": executable file not found in $PATH.

Install the following package:

sudo apt install nvidia-container-runtime
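Since the error reports that nvidia-container-runtime-hook is not found in $PATH, one quick way to confirm the installation worked is to look the executable up on the PATH. A minimal sketch using only the Python standard library (find_hook is a hypothetical helper name):

```python
import shutil

def find_hook(name="nvidia-container-runtime-hook"):
    """Return the full path of the executable if it is on PATH, else None."""
    return shutil.which(name)

path = find_hook()
if path:
    print("hook found at:", path)
else:
    print("hook not found on PATH; docker --gpus all will likely fail")
```

Note that this checks the PATH of the current user and shell, which may differ from the environment the Docker daemon runs in.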

Check the driver and GPU information:

core@local-05:~/data$ nvidia-smi
Sun Feb 16 21:48:50 2020
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.44       Driver Version: 440.44       CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 1070    Off  | 00000000:03:00.0 Off |                  N/A |
| 60%   79C    P2   143W / 160W |   7759MiB /  8118MiB |     98%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      9790      C   /usr/bin/python                             7745MiB |
+-----------------------------------------------------------------------------+
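In the output above the GPU memory is almost fully used by the training process. If you want to read these numbers programmatically, one option is to extract them from the nvidia-smi text with a regular expression. The sketch below parses a captured line rather than calling nvidia-smi, and parse_gpu_memory is a hypothetical helper; it assumes the "usedMiB / totalMiB" layout shown above, which may differ across driver versions.

```python
import re

# Matches the "7759MiB /  8118MiB" memory-usage column.
MEMORY_PATTERN = re.compile(r"(\d+)MiB\s*/\s*(\d+)MiB")

def parse_gpu_memory(smi_text):
    """Return a list of (used_mib, total_mib) pairs, one per GPU row found."""
    return [(int(used), int(total))
            for used, total in MEMORY_PATTERN.findall(smi_text)]

# Line copied from the nvidia-smi output above.
sample = "| 60%   79C    P2   143W / 160W |   7759MiB /  8118MiB |     98%      Default |"
for used, total in parse_gpu_memory(sample):
    print(f"GPU memory: {used} / {total} MiB ({used / total:.1%} used)")
```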
