Is there a JavaScript implementation of the cl100k_base tokenizer?
OpenAI's new embeddings API uses the cl100k_base tokenizer. I'm calling it from the NodeJS client, but I see no easy way of slicing my strings so they don't exceed OpenAI's limit of 8192 tokens.
This would be trivial if I could first encode the string, slice it to the limit, then decode it and send it to the API.
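For what it's worth, there are JavaScript ports of OpenAI's tiktoken on npm that support cl100k_base, such as the pure-JS js-tiktoken package. A minimal sketch of the encode → slice → decode approach, assuming js-tiktoken's getEncoding API (the 8192 limit and the helper name are just illustrative):

```js
// Sketch of encode → slice → decode, assuming the js-tiktoken
// package (a pure-JS port of OpenAI's tiktoken).
import { getEncoding } from "js-tiktoken";

const enc = getEncoding("cl100k_base");

// Truncate `text` so it encodes to at most `maxTokens` tokens.
function truncateToTokenLimit(text, maxTokens = 8192) {
  const tokens = enc.encode(text);
  if (tokens.length <= maxTokens) return text;
  // Decode only the first `maxTokens` tokens back to a string.
  return enc.decode(tokens.slice(0, maxTokens));
}

// Usage: make the input safe before calling the embeddings API.
const longDocument = "some very long document ".repeat(10000);
const safeInput = truncateToTokenLimit(longDocument);
console.log(enc.encode(safeInput).length); // <= 8192
```

One caveat: slicing at an arbitrary token index can cut through a multi-token character, so the truncated string may lose a character or two at the boundary. There is also a WASM-based tiktoken package on npm with the same encodings, though as far as I can tell its decode returns raw UTF-8 bytes (a Uint8Array) rather than a string, so you'd pass the result through a TextDecoder.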