  1. Language model inference through the API is billed in credits. Tokens generated by a model are converted into credits, and the cost per token depends on the model size.

  2. Access to the hosted API for language models can be requested in the PLGRID portal under the services tab. Search for and activate the service “Access to Large Language Models” at the Cyfronet centre. The service can only be activated by users who have an appropriate grant that includes a credit allocation.

  3. Once access to the service is granted, visit llmlab.plgrid.pl and log in using your PLGRID portal credentials.

  4. After logging in, the main panel will display:

    1. A list of active grants, along with the available credits and their usage status.
    2. A section for generating API keys for a specific grant.
    3. A section for checking currently available models.
  5. To generate an API key, go to the “Generate API Key” section and select from the dropdown menu the grant that should be billed for the requests. Then click the “Generate” button. The API key used to authenticate model queries will appear on the screen. The key is valid for 30 days and cannot be displayed again, so copy it before leaving the page.

  6. To check available models, click the “Check” button in the “Check available models” section. A list of currently active models will be displayed. These model names are required when specifying which model should respond to a query.

  7. To check available models via the API, send a GET request to the URL https://llmlab.plgrid.pl/api/v1/models. Structure the request as follows (remember to replace the <API-key> field with the generated key):

    curl -X 'GET' \
      'https://llmlab.plgrid.pl/api/v1/models' \
      -H 'accept: application/json' \
      -H 'Authorization: Bearer <API-key>'
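
The model names can be pulled out of the response with standard tools. The sketch below assumes the endpoint returns an OpenAI-style list, that is, a JSON object whose `data` array holds entries with an `id` field. That response shape, the `PLGRID_API_KEY` environment variable, and the helper names `list_models` and `extract_model_ids` are illustrative assumptions, not part of the service documentation.

```shell
# Sketch: list model names from the /models endpoint.
# Assumption: the response follows the OpenAI list format, e.g.
#   {"object":"list","data":[{"id":"<model-name>", ...}]}

# Fetch the raw JSON. Expects the key in the (hypothetical) PLGRID_API_KEY variable.
list_models() {
  curl -sS -X GET 'https://llmlab.plgrid.pl/api/v1/models' \
    -H 'accept: application/json' \
    -H "Authorization: Bearer ${PLGRID_API_KEY:?set PLGRID_API_KEY first}"
}

# Print every "id" value found on stdin, one per line.
# Plain grep/sed keeps dependencies minimal; a JSON-aware tool would be more robust.
extract_model_ids() {
  grep -o '"id"[[:space:]]*:[[:space:]]*"[^"]*"' \
    | sed 's/^"id"[[:space:]]*:[[:space:]]*"//; s/"$//'
}

# Usage: list_models | extract_model_ids
```

If the actual response is shaped differently, only `extract_model_ids` needs adjusting.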


  8. The API is compatible with the OpenAI API. To send a query to a model, send a POST request to the URL https://llmlab.plgrid.pl/api/v1/completions with the following structure (remember to replace the <API-key> field with the generated key and the <model-name> field with the chosen model name):

    curl -X 'POST' \
      'https://llmlab.plgrid.pl/api/v1/completions' \
      -H 'accept: application/json' \
      -H 'Authorization: Bearer <API-key>' \
      -H 'Content-Type: application/json' \
      -d '{
        "model": "<model-name>",
        "messages": [
          {
            "role": "user",
            "content": "Hi, how are you?"
          }
        ],
        "max_tokens": 100,
        "top_p": 1,
        "temperature": 1,
        "presence_penalty": 0,
        "frequency_penalty": 0,
        "stream": false
      }'

    The model responds in the format specified by the OpenAI API. Details of the request parameters are available in the OpenAI API documentation.
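
For repeated queries it can help to wrap the request in a small script. The sketch below builds the same request body from arguments and pipes it to curl. The `PLGRID_API_KEY` environment variable and the helper names `build_payload` and `ask_model` are hypothetical, and the payload builder assumes the model name and prompt contain no double quotes, backslashes, or newlines.

```shell
# Sketch: send a prompt to the completions endpoint from a script.
# Hypothetical names: PLGRID_API_KEY, build_payload, ask_model.

# Build the JSON request body for a single user message.
# $1 = model name, $2 = prompt, $3 = max_tokens (optional, defaults to 100)
# Assumption: $1 and $2 need no JSON escaping (no quotes, backslashes, newlines).
build_payload() {
  printf '{"model": "%s", "messages": [{"role": "user", "content": "%s"}], "max_tokens": %s, "stream": false}' \
    "$1" "$2" "${3:-100}"
}

# POST the payload read from stdin (-d @-); -f makes curl fail on HTTP errors.
ask_model() {
  build_payload "$@" | curl -sS -f -X POST 'https://llmlab.plgrid.pl/api/v1/completions' \
    -H 'accept: application/json' \
    -H "Authorization: Bearer ${PLGRID_API_KEY:?set PLGRID_API_KEY first}" \
    -H 'Content-Type: application/json' \
    -d @-
}

# Usage: ask_model '<model-name>' 'Hi, how are you?' 50
```

Reading the key from the environment keeps it out of the script file. For prompts with arbitrary text, a JSON-aware tool is a safer way to build the body.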

  9. API documentation compatible with the OpenAPI standard is available at llmlab.plgrid.pl/api/docs.

  10. Cost of requests for each model, in credits per token:
    1. speakleash/Bielik-11B-v2.3-Instruct: 0.000001
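
To get a feel for the numbers, the rate can be multiplied by a token count. The sketch below assumes that credits are simply the number of generated tokens times the per-token rate, as the billing description suggests. Whether prompt tokens are also counted is not stated here, and the helper name `estimate_credits` is illustrative.

```shell
# Sketch: estimate the credit cost of generated tokens.
# Assumption: credits = generated tokens * credits per token.

# $1 = number of generated tokens, $2 = credits per token
estimate_credits() {
  awk -v tokens="$1" -v rate="$2" 'BEGIN { printf "%.6f\n", tokens * rate }'
}

# Example with the rate listed for speakleash/Bielik-11B-v2.3-Instruct:
# a response capped at max_tokens = 100 costs at most
estimate_credits 100 0.000001   # prints 0.000100
```

At this rate, one credit covers roughly one million generated tokens.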