Neuron-compiling a BERT model for Inferentia on TF2
The tutorial at https://awsdocs-neuron.readthedocs-hosted.com/en/latest/frameworks/tensorflow/tensorflow-neuron/tutorials/bert_demo/bert_demo.html shows how to compile using TensorFlow 1. Can anyone let me know the steps to Neuron-compile a BERT-large model for running inference on Inferentia using TensorFlow 2?
Thanks in advance, Ajay
P.S. This is what my log looks like while compiling on TF1:
INFO:tensorflow:fusing subgraph {subgraph neuron_op_e76ab3d9bc74f09f with input tensors ["<tf.Tensor 'bert/encoder/ones0/_0:0' shape=(1, 512, 1) dtype=float32>", "<tf.Tensor 'bert/encoder/Cast0/_1:0' shape=(1, 1, 512) dtype=float32>", "<tf.Tensor 'bert/embeddings/LayerNorm/batchnorm/add_10/_2:0' shape=(1, 512, 1024) dtype=float32>"], output tensors ["<tf.Tensor 'bert/pooler/dense/Tanh:0' shape=(1, 1024) dtype=float32>", "<tf.Tensor 'bert/encoder/layer_23/output/LayerNorm/batchnorm/add_1:0' shape=(1, 512, 1024) dtype=float32>"]} with neuron-cc . Compiler status ERROR
WARNING:tensorflow:11/03/2022 04:28:48 AM ERROR 9932 [neuron-cc]: Failed to parse model /tmp/tmpbyvnmr6h/neuron_op_e76ab3d9bc74f09f/graph_def.pb: The following operators are not implemented: {'Einsum'} (NotImplementedError)
INFO:tensorflow:Number of operations in TensorFlow session: 7427
INFO:tensorflow:Number of operations after tf.neuron optimizations: 2901
INFO:tensorflow:Number of operations placed on Neuron runtime: 0
WARNING:tensorflow:Converted /home/ubuntu/bert_repo/patent_model/ to ./bert-saved-model-neuron_tf1.15 but no operator will be running on AWS machine learning accelerators. This is probably not what you want. Please refer to https://github.com/aws/aws-neuron-sdk for current limitations of the AWS Neuron SDK. We are actively improving (and hiring)! {'OnNeuronRatio': 0.0}
I assume that an OnNeuronRatio of 0 means I won't be able to make use of Inferentia hardware acceleration. Is that correct?
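To make my reading of the summary lines concrete, here is a minimal sketch that pulls the three operation counts out of the log and computes a ratio from them. It assumes that OnNeuronRatio is the number of operations placed on the Neuron runtime divided by the number of operations left after the tf.neuron optimizations; that formula is my guess, not something the log or the docs I have read confirm. The helper name `op_counts` is made up for this example.

```python
import re

# Summary lines copied from the compile log above.
log = """\
INFO:tensorflow:Number of operations in TensorFlow session: 7427
INFO:tensorflow:Number of operations after tf.neuron optimizations: 2901
INFO:tensorflow:Number of operations placed on Neuron runtime: 0
"""


def op_counts(text):
    """Return the three operation counts reported in the summary lines."""
    patterns = {
        "session": r"Number of operations in TensorFlow session: (\d+)",
        "optimized": r"Number of operations after tf\.neuron optimizations: (\d+)",
        "on_neuron": r"Number of operations placed on Neuron runtime: (\d+)",
    }
    return {name: int(re.search(p, text).group(1)) for name, p in patterns.items()}


counts = op_counts(log)

# Assumption: ratio = ops placed on Neuron / ops remaining after optimization.
ratio = counts["on_neuron"] / counts["optimized"] if counts["optimized"] else 0.0

print(counts)
print("ratio:", ratio)
```

With zero operations placed on the Neuron runtime, this ratio is 0.0 whatever the exact denominator is, which matches the {'OnNeuronRatio': 0.0} reported in the warning.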