Managed Inference

Set up and interact with CosmicAC inference services for chat completions. Use these commands to configure your API credentials, select a model, and send prompts directly from the terminal.

Command

cosmicac inference <subcommand> [options]

Subcommands

Subcommand	Description
init	Set up API key and default model.
list-models	Display all available models.
chat	Send chat completion requests.

`inference init`

Interactively sets up your API key and default model for inference operations.

Usage

cosmicac inference init

Example

$ cosmicac inference init

? Select protocol: (Use arrow keys)
❯ HTTP (default)
  HRPC (P2P)

? Enter your API key: ttr-proj-xxxxxxxxxxxx
Validating API key...
Project ID: proj_abc123
Key prefix: ttr-proj-xxx

? Select default model:
❯ TinyLlama/TinyLlama-1.1B-Chat-v1.0
  meta-llama/Llama-2-7b-chat

API key and default model saved successfully.

`inference list-models`

Displays all available models for inference operations.

Usage

cosmicac inference list-models

Example

$ cosmicac inference list-models

Available models:
──────────────────────────────────────────────────
- TinyLlama/TinyLlama-1.1B-Chat-v1.0
  ID: TinyLlama/TinyLlama-1.1B-Chat-v1.0 (default)
  Context length: 2,048 tokens

──────────────────────────────────────────────────
Total models: 1

Default model: TinyLlama/TinyLlama-1.1B-Chat-v1.0

Note: Run `cosmicac inference init` first to set up your API key.

`inference chat`

Sends chat completion requests to the configured inference endpoint.

Usage

cosmicac inference chat [options]

Options

Option	Description
`--message`	Message to send (required in non-interactive mode).
`--model`	Model to use (optional if default is set).
`--api-key`	API key for authentication.
`--max-tokens`	Maximum tokens to generate (default: 1000).
`--temperature`	Temperature for sampling (default: 1.0).
`--stream`	Enable streaming response (default: false).
`--interactive`	Enable interactive chat mode (default: false).

Examples

# Basic chat completion
$ cosmicac inference chat --message "Explain quantum computing"

# Interactive chat mode with streaming
$ cosmicac inference chat --interactive --stream

# Chat with streaming response
$ cosmicac inference chat --message "Say Hello" --stream

# Chat with specific model and parameters
$ cosmicac inference chat --message "Write a Python function" --model "TinyLlama/TinyLlama-1.1B-Chat-v1.0" --max-tokens 500 --temperature 0.7

# Using API key from the command line
$ cosmicac inference chat --api-key "ttr-proj-xxxx" --message "Hello"

API Key Resolution

The API key is resolved in the following order:

1. `--api-key` flag
2. `COSMICAC_API_KEY` environment variable
3. Stored key from the `inference init` command
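
The precedence above can be sketched as a small shell function. This only illustrates the documented order and is not the CLI's actual implementation; `resolve_api_key` and its two arguments (the flag value and the stored key) are hypothetical names introduced for the example.

```shell
# Hypothetical helper illustrating the documented API key precedence.
# $1: value passed via --api-key (may be empty)
# $2: key saved by `cosmicac inference init` (may be empty)
resolve_api_key() {
  flag_key="${1:-}"
  stored_key="${2:-}"
  if [ -n "$flag_key" ]; then
    printf '%s\n' "$flag_key"            # 1. --api-key flag takes priority
  elif [ -n "${COSMICAC_API_KEY:-}" ]; then
    printf '%s\n' "$COSMICAC_API_KEY"    # 2. environment variable
  elif [ -n "$stored_key" ]; then
    printf '%s\n' "$stored_key"          # 3. stored key
  else
    return 1                             # no key available from any source
  fi
}
```

In practice this means an exported `COSMICAC_API_KEY` overrides the stored key for that shell session, and an explicit `--api-key` overrides both.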

Model Resolution

The model is resolved in the following order:

1. `--model` flag
2. Default model from the `inference init` command

Note: If no model is provided and no default is set, you will be prompted to run `cosmicac inference init`.
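
As a rough sketch of this fallback, assuming a hypothetical `resolve_model` helper that receives the `--model` value and the saved default (the error text is illustrative and is not the CLI's real output):

```shell
# Hypothetical helper illustrating the documented model fallback.
# $1: value passed via --model (may be empty)
# $2: default model saved by `cosmicac inference init` (may be empty)
resolve_model() {
  flag_model="${1:-}"
  default_model="${2:-}"
  if [ -n "$flag_model" ]; then
    printf '%s\n' "$flag_model"          # explicit --model wins
  elif [ -n "$default_model" ]; then
    printf '%s\n' "$default_model"       # fall back to the saved default
  else
    # Illustrative message only; the real prompt may differ.
    echo "No model set. Run: cosmicac inference init" >&2
    return 1
  fi
}
```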
