Is there a JavaScript implementation of the cl100k_base tokenizer?

OpenAI's new embeddings API uses the cl100k_base tokenizer. I'm calling it from the Node.js client, but I see no easy way of slicing my strings so they don't exceed OpenAI's limit of 8192 tokens.

This would be trivial if I could first encode the string, slice the token array down to the limit, then decode it back to text before sending it to the API.
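For anyone landing here with the same problem: the `tiktoken` npm package (a WASM build of OpenAI's tiktoken) exposes cl100k_base in JavaScript, and pure-JS ports such as `js-tiktoken` and `gpt-tokenizer` exist as well. Below is a minimal sketch of the encode → slice → decode approach described above, assuming the `tiktoken` package's `get_encoding` API; the function name and the sample input are illustrative, not from any library.

```js
// npm install tiktoken
import { get_encoding } from "tiktoken";

const MAX_TOKENS = 8192; // the embeddings input limit mentioned above

// Truncate `text` so its cl100k_base token count does not exceed `limit`.
function truncateToTokenLimit(text, limit = MAX_TOKENS) {
  const enc = get_encoding("cl100k_base");
  try {
    const tokens = enc.encode(text); // Uint32Array of token ids
    if (tokens.length <= limit) return text;
    // decode() returns UTF-8 bytes, so convert them back to a string
    const bytes = enc.decode(tokens.slice(0, limit));
    return new TextDecoder().decode(bytes);
  } finally {
    enc.free(); // release the WASM-side encoder memory
  }
}

// Hypothetical usage: shrink an oversized document before embedding it
const longDocument = "some very long text ".repeat(100000);
const safeInput = truncateToTokenLimit(longDocument);
console.log(safeInput.length);
```

One caveat: a token boundary is not always a character boundary, so decoding a truncated token array can end in a U+FFFD replacement character where a multi-byte sequence was cut. Trimming that trailing character (or re-encoding the result to verify the count) keeps the string valid and under the limit.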


