Managed Inference
Set up and interact with CosmicAC inference services for chat completions. Use these commands to configure your API credentials, select a model, and send prompts directly from the terminal.
Command
cosmicac inference <subcommand> [options]
Subcommands
| Subcommand | Description |
|---|---|
| init | Set up API key and default model. |
| list-models | Display all available models. |
| chat | Send chat completion requests. |
inference init
Interactively sets up your API key and default model for inference operations.
Usage
cosmicac inference init
Example
$ cosmicac inference init
? Select protocol: (Use arrow keys)
❯ HTTP (default)
  HRPC (P2P)
? Enter your API key: ttr-proj-xxxxxxxxxxxx
Validating API key...
Project ID: proj_abc123
Key prefix: ttr-proj-xxx
? Select default model:
❯ TinyLlama/TinyLlama-1.1B-Chat-v1.0
  meta-llama/Llama-2-7b-chat
API key and default model saved successfully.
inference list-models
Displays all available models for inference operations.
Usage
cosmicac inference list-models
Example
$ cosmicac inference list-models
Available models:
──────────────────────────────────────────────────
- TinyLlama/TinyLlama-1.1B-Chat-v1.0
ID: TinyLlama/TinyLlama-1.1B-Chat-v1.0 (default)
Context length: 2,048 tokens
──────────────────────────────────────────────────
Total models: 1
Default model: TinyLlama/TinyLlama-1.1B-Chat-v1.0
Note: You must run cosmicac inference init first to set up your API key.
inference chat
Sends chat completion requests to the configured inference endpoint.
Usage
cosmicac inference chat [options]
Options
| Option | Description |
|---|---|
| --message | Message to send (required in non-interactive mode). |
| --model | Model to use (optional if default is set). |
| --api-key | API key for authentication. |
| --max-tokens | Maximum tokens to generate (default: 1000). |
| --temperature | Temperature for sampling (default: 1.0). |
| --stream | Enable streaming response (default: false). |
| --interactive | Enable interactive chat mode (default: false). |
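When calling the chat command from a script, it can help to assemble the options in one place. The following is a minimal sketch, not part of the CosmicAC CLI: the function name chat_once and the DRY_RUN variable are hypothetical, and the defaults simply mirror the values listed in the table above.

```shell
# Hypothetical wrapper (not shipped with CosmicAC): runs one non-interactive
# chat request using the documented --message, --max-tokens, and
# --temperature options. Set DRY_RUN=1 to print the command instead.
chat_once() {
  message="$1"
  max_tokens="${2:-1000}"     # documented default for --max-tokens
  temperature="${3:-1.0}"     # documented default for --temperature

  # --message is required in non-interactive mode.
  if [ -z "$message" ]; then
    echo "chat_once: a message is required in non-interactive mode" >&2
    return 1
  fi

  # Build the argument list in the positional parameters so that a message
  # containing spaces stays a single argument.
  set -- cosmicac inference chat \
    --message "$message" \
    --max-tokens "$max_tokens" \
    --temperature "$temperature"

  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}
```

With DRY_RUN=1 set, chat_once "Say Hello" 500 0.7 prints the assembled command without contacting the inference endpoint, which is a convenient way to check a script before running it.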
Examples
# Basic chat completion
$ cosmicac inference chat --message "Explain quantum computing"
# Interactive chat mode with streaming
$ cosmicac inference chat --interactive --stream
# Chat with streaming response
$ cosmicac inference chat --message "Say Hello" --stream
# Chat with specific model and parameters
$ cosmicac inference chat --message "Write a Python function" --model "TinyLlama/TinyLlama-1.1B-Chat-v1.0" --max-tokens 500 --temperature 0.7
# Using API key from the command line
$ cosmicac inference chat --api-key "ttr-proj-xxxx" --message "Hello"
API Key Resolution
The API key is resolved in the following order:
1. --api-key flag
2. COSMICAC_API_KEY environment variable
3. Stored key from the inference init command
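The precedence above can be sketched as a small shell function. This is an illustration only, not the CLI's implementation: resolve_api_key is a hypothetical name, the stored key is passed in as an argument because how the CLI stores it is not described here, and treating an empty value as "not set" is an assumption.

```shell
# Hypothetical helper illustrating the documented lookup order.
# Usage: resolve_api_key FLAG_VALUE STORED_VALUE
resolve_api_key() {
  flag_value="$1"      # value given with --api-key, or empty
  stored_value="$2"    # key saved by `cosmicac inference init`, or empty

  if [ -n "$flag_value" ]; then
    # 1. The --api-key flag wins over everything else.
    printf '%s\n' "$flag_value"
  elif [ -n "${COSMICAC_API_KEY:-}" ]; then
    # 2. Otherwise fall back to the environment variable.
    printf '%s\n' "$COSMICAC_API_KEY"
  elif [ -n "$stored_value" ]; then
    # 3. Otherwise use the stored key.
    printf '%s\n' "$stored_value"
  else
    # No key available from any source.
    return 1
  fi
}
```

In practice this means an exported COSMICAC_API_KEY overrides the stored key for the current shell session, and a one-off --api-key overrides both for a single command.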
Model Resolution
The model is resolved in the following order:
1. --model flag
2. Default model from the inference init command
Note: If no model is provided and no default is set, you will be prompted to run cosmicac inference init.
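Model resolution follows the same pattern with one fewer source. A minimal sketch, assuming the hypothetical name resolve_model and an error message worded for illustration (the CLI's actual prompt text is not shown in this document):

```shell
# Hypothetical helper illustrating the documented model lookup order.
# Usage: resolve_model FLAG_MODEL DEFAULT_MODEL
resolve_model() {
  flag_model="$1"       # value given with --model, or empty
  default_model="$2"    # default saved by `cosmicac inference init`, or empty

  if [ -n "$flag_model" ]; then
    # 1. An explicit --model flag takes priority.
    printf '%s\n' "$flag_model"
  elif [ -n "$default_model" ]; then
    # 2. Otherwise use the default chosen during init.
    printf '%s\n' "$default_model"
  else
    # Neither source is set: point the user at the init command.
    echo "No model set. Run: cosmicac inference init" >&2
    return 1
  fi
}
```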