
Managed Inference

Set up and interact with CosmicAC inference services for chat completions. Use these commands to configure your API credentials, select a model, and send prompts directly from the terminal.

Command

cosmicac inference <subcommand> [options]

Subcommands

Subcommand     Description
init           Set up API key and default model.
list-models    Display all available models.
chat           Send chat completion requests.

inference init

Interactively sets up your API key and default model for inference operations.

Usage

cosmicac inference init

Example

$ cosmicac inference init

? Select protocol: (Use arrow keys)
 HTTP (default)
  HRPC (P2P)

? Enter your API key: ttr-proj-xxxxxxxxxxxx
Validating API key...
Project ID: proj_abc123
Key prefix: ttr-proj-xxx

? Select default model:
 TinyLlama/TinyLlama-1.1B-Chat-v1.0
  meta-llama/Llama-2-7b-chat

API key and default model saved successfully.

inference list-models

Displays all available models for inference operations.

Usage

cosmicac inference list-models

Example

$ cosmicac inference list-models

Available models:
──────────────────────────────────────────────────
- TinyLlama/TinyLlama-1.1B-Chat-v1.0
  ID: TinyLlama/TinyLlama-1.1B-Chat-v1.0 (default)
  Context length: 2,048 tokens

──────────────────────────────────────────────────
Total models: 1

Default model: TinyLlama/TinyLlama-1.1B-Chat-v1.0

Note: You must run cosmicac inference init first to set up your API key.


inference chat

Sends chat completion requests to the configured inference endpoint.

Usage

cosmicac inference chat [options]

Options

Option           Description
--message        Message to send (required in non-interactive mode).
--model          Model to use (optional if a default is set).
--api-key        API key for authentication.
--max-tokens     Maximum tokens to generate (default: 1000).
--temperature    Temperature for sampling (default: 1.0).
--stream         Enable streaming response (default: false).
--interactive    Enable interactive chat mode (default: false).

Examples

# Basic chat completion
$ cosmicac inference chat --message "Explain quantum computing"

# Interactive chat mode with streaming
$ cosmicac inference chat --interactive --stream

# Chat with streaming response
$ cosmicac inference chat --message "Say Hello" --stream

# Chat with specific model and parameters
$ cosmicac inference chat --message "Write a Python function" --model "TinyLlama/TinyLlama-1.1B-Chat-v1.0" --max-tokens 500 --temperature 0.7

# Using API key from the command line
$ cosmicac inference chat --api-key "ttr-proj-xxxx" --message "Hello"
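
The single-prompt examples above can be combined into a small batch helper. The following is a minimal sketch, not part of the CLI: chat_each_line is a hypothetical function name, and it assumes cosmicac is on your PATH and already configured. It uses only the documented --message and --max-tokens flags.

```shell
# Hypothetical helper (not part of the cosmicac CLI): sends each non-empty
# line read from standard input as a separate chat completion request.
# $1: value for --max-tokens (falls back to the documented default of 1000)
chat_each_line() {
  max_tokens="${1:-1000}"
  while IFS= read -r prompt || [ -n "$prompt" ]; do
    # Skip blank lines instead of sending empty messages.
    [ -z "$prompt" ] && continue
    # Redirect stdin so the command cannot consume the remaining prompts.
    cosmicac inference chat --message "$prompt" --max-tokens "$max_tokens" </dev/null
  done
}
```

For example, chat_each_line 200 < prompts.txt would send every line of a (hypothetical) prompts.txt file with a 200-token limit.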

API Key Resolution

The API key is resolved in the following order:

  1. --api-key flag
  2. COSMICAC_API_KEY environment variable
  3. Stored key from inference init command
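
The precedence above can be sketched as a small shell function. This is an illustration of the documented order only, not the CLI's actual implementation: resolve_api_key and its two arguments are hypothetical, and where the stored key lives on disk is not specified here, so it is passed in as a plain string.

```shell
# Hypothetical helper mirroring the documented precedence.
# $1: value given via --api-key (may be empty)
# $2: key stored by `cosmicac inference init` (may be empty)
resolve_api_key() {
  flag_key="${1:-}"
  stored_key="${2:-}"
  if [ -n "$flag_key" ]; then
    # 1. The --api-key flag wins over everything else.
    printf '%s\n' "$flag_key"
  elif [ -n "${COSMICAC_API_KEY:-}" ]; then
    # 2. Otherwise use the environment variable.
    printf '%s\n' "$COSMICAC_API_KEY"
  elif [ -n "$stored_key" ]; then
    # 3. Finally fall back to the stored key.
    printf '%s\n' "$stored_key"
  else
    echo "no API key found; run: cosmicac inference init" >&2
    return 1
  fi
}
```

In practice this means a key exported as COSMICAC_API_KEY overrides the stored one for the current shell session, and --api-key overrides both for a single command.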

Model Resolution

The model is resolved in the following order:

  1. --model flag
  2. Default model from inference init command

Note: If no model is provided and no default is set, you will be prompted to run cosmicac inference init.
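
The model fallback can be sketched the same way. This is a minimal illustration under stated assumptions, not the CLI's own code: resolve_model is a hypothetical function, and the default model is passed in as an argument because the location of the saved configuration is not documented here.

```shell
# Hypothetical helper mirroring the documented model precedence.
# $1: value given via --model (may be empty)
# $2: default model saved by `cosmicac inference init` (may be empty)
resolve_model() {
  # Use the flag value when present, otherwise the saved default.
  model="${1:-${2:-}}"
  if [ -z "$model" ]; then
    echo "no model selected; run: cosmicac inference init" >&2
    return 1
  fi
  printf '%s\n' "$model"
}
```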
