Is there a JavaScript implementation of the cl100k_base tokenizer?

OpenAI's new embeddings API uses the cl100k_base tokenizer. I'm calling it from the Node.js client, but I see no easy way of slicing my strings so they don't exceed OpenAI's limit of 8192 tokens.
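For context, this is roughly the call I'm making (sketched with the openai Node package; the exact method name depends on your client version, and text-embedding-ada-002 here just stands in for whichever embeddings model is in use):

```js
import OpenAI from "openai";

const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment

// This fails with an invalid_request_error when `text`
// encodes to more than 8192 cl100k_base tokens.
const response = await openai.embeddings.create({
  model: "text-embedding-ada-002",
  input: text,
});
```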

This would be trivial if I could first encode the string to tokens, slice the token array down to the limit, then decode it back to a string before sending it to the API.
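In other words, something like the following sketch. I'm assuming a package such as js-tiktoken here; any tokenizer that supports cl100k_base and exposes encode/decode would do:

```js
import { getEncoding } from "js-tiktoken";

const MAX_TOKENS = 8192; // OpenAI's stated per-input limit

function truncateToTokenLimit(text, maxTokens = MAX_TOKENS) {
  const enc = getEncoding("cl100k_base");
  const tokens = enc.encode(text);        // string -> array of token ids
  if (tokens.length <= maxTokens) return text;
  // token ids -> string, cut off at the limit
  return enc.decode(tokens.slice(0, maxTokens));
}
```

One caveat I'm aware of: cl100k_base is a byte-level BPE, so a token boundary isn't necessarily a character boundary, and truncating may leave a partial character (or a replacement character) at the tail. That would be acceptable for my use case.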


