Use of the API for language model inference is billed in credits. Tokens generated by a given model are converted into credits, and the cost per token depends on the model size.
Access to the hosted API for language models can be requested in the PLGRID portal under the services tab. Search for and activate the service Access to Large Language Models at the Cyfronet centre. The service can only be activated for users with an appropriate grant that includes a credit allocation.
Once access to the service is granted, visit llmlab.plgrid.pl and log in using your PLGRID portal credentials.
After logging in, the main panel will be displayed. It contains the sections described below.
To generate an API key, select the grant from which the requests should be billed from the dropdown menu in the “Generate API Key” section, then click the “Generate” button. The API key used to authenticate model queries will appear on the screen. The key is valid for 30 days and cannot be viewed again, so copy and store it securely.
To check available models, click the “Check” button in the “Check available models” section. A list of currently active models will be displayed. These model names are required when specifying which model should respond to a query.
To check available models via the API, send a GET request to https://llmlab.plgrid.pl/api/v1/models. The request has to be structured as follows (remember to replace the <API-key> field with your generated token):
```
curl -X 'GET' \
  'https://llmlab.plgrid.pl/api/v1/models' \
  -H 'accept: application/json' \
  -H 'Authorization: Bearer <API-key>'
```
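For scripted access, the same call can be made from Python. The sketch below uses the requests library and assumes the generated key has been exported in an environment variable named PLGRID_LLM_API_KEY (the variable name is only an example, not part of the service):

```python
import os

import requests

# Assumption: the API key was stored in this environment variable after generation.
API_KEY = os.environ["PLGRID_LLM_API_KEY"]

response = requests.get(
    "https://llmlab.plgrid.pl/api/v1/models",
    headers={
        "accept": "application/json",
        "Authorization": f"Bearer {API_KEY}",
    },
    timeout=30,
)
response.raise_for_status()

# Print the JSON listing of currently active models.
print(response.json())
```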
The API is compatible with the OpenAI API. To send a query to a model, use a POST request to https://llmlab.plgrid.pl/api/v1/completions with the following structure (remember to replace the <API-key> field with your generated token and the <model-name> field with the chosen model name):
```
curl -X 'POST' \
  'https://llmlab.plgrid.pl/api/v1/completions' \
  -H 'accept: application/json' \
  -H 'Authorization: Bearer <API-key>' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "<model-name>",
    "messages": [
      {
        "role": "user",
        "content": "Hi, how are you?"
      }
    ],
    "max_tokens": 100,
    "top_p": 1,
    "temperature": 1,
    "presence_penalty": 0,
    "frequency_penalty": 0,
    "stream": false
  }'
```
The model will respond in the format specified by the OpenAI API. Details of the request parameters are available in the OpenAI API documentation.
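The same query can also be sent from Python. The minimal sketch below again uses the requests library, assumes the key is available in the PLGRID_LLM_API_KEY environment variable (an example name), and uses a placeholder model name that must be replaced with one returned by the models endpoint. The reply is extracted according to the OpenAI chat completion response format.

```python
import os

import requests

# Assumptions: the key is stored in this environment variable and
# "<model-name>" is replaced with a model name returned by /api/v1/models.
API_KEY = os.environ["PLGRID_LLM_API_KEY"]
MODEL = "<model-name>"

payload = {
    "model": MODEL,
    "messages": [{"role": "user", "content": "Hi, how are you?"}],
    "max_tokens": 100,
    "top_p": 1,
    "temperature": 1,
    "presence_penalty": 0,
    "frequency_penalty": 0,
    "stream": False,
}

response = requests.post(
    "https://llmlab.plgrid.pl/api/v1/completions",
    headers={
        "accept": "application/json",
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    },
    json=payload,
    timeout=60,
)
response.raise_for_status()

# Extract the generated reply from the OpenAI-style chat completion response.
data = response.json()
print(data["choices"][0]["message"]["content"])
```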
API documentation compatible with the OpenAPI standard is available at llmlab.plgrid.pl/api/docs.