AWS Free Trial Guide

Hello fellow SillyTavern enthusiast! Hope you're having a good day~

AWS launched a free tier trial subscription recently. They allow new users to have free 200$ credits (100$ at the start and 20$ for each quest you complete, there's 5 quests) for 6 months. You can spend these credits on Amazon Bedrock, which includes Anthropic models. Since there's no direct API to Bedrock, you need a Proxy or a BYOK Gateway. Well, you can use this Rentry as a guide if you want to.

You start with 100$ credits and earn 20$ for each quest you complete. You can complete the quests by creating the instances each quest wants and deleting them right after, so that you won't waste credits just by having them run idle.

First let's set up your AWS account

1. Sign up here.
2. Create a user in the IAM panel. In the permissions panel, click on Attach policies directly and grant the user with the AmazonBedrockFullAccess permission.

3. Go into the user's panel you've created. Create an Access Key and a Secret Access Key from the Security Credentials section. You get both on the same window. Make sure to save the Secret Access Key because you will see it only once after creation.

4. Request access to all models on Bedrock panel.

Whatever endpoint you use, make sure your region on both AWS and the key are us-east-1.

Choose a Gateway or a Proxy

Openrouter (BYOK)

Free to use
Supports prompt caching
Supports all the models
Doesn't store prompts

LiteLLM (Local Proxy)

Free to use
Supports prompt caching
Supports all the models
Doesn't store prompts

Portkey (BYOK)

Free to use
Supports prompt caching
Supports all the models
Stores prompts

Vercel (BYOK)

Free to use
Doesn't support prompt caching
Doesn't support Opus models
Stores prompts

Openrouter

1. Go to Settings → Integrations (BYOK) → Amazon Bedrock to insert your credentials. Tick "Always use this key.".

2. If you're using SillyTavern:

Set your provider to Amazon Bedrock in SillyTavern connection profile panel. Untick "Allow fallback providers" to prevent high costs from other providers if/when AWS fails.

3. If everything is set up correctly, there should be dots around the provider icon at the left side of your request on the Activity panel.

LiteLLM

1. Run these two commands in your CLI
pip install litellm
pip install litellm[proxy]

2. Create a credentials.txt file and add your keys in it. Go into C:\Users\<USER> and create a folder named .aws. Delete the .txt extension from the credentials.txt and put it in the folder.

⎗
✓
[default]
aws_access_key_id = YOUR_ACCESS_KEY
aws_secret_access_key = YOUR_SECRET_ACCESS_KEY
region = us-east-1

3. Create a folder wherever you want your proxy to be. Then create a config.yaml file in there. Add this config in it.

⎗

model_list:
  - model_name: bedrock-claude-sonnet-3.7
    litellm_params:
      model: bedrock/us.anthropic.claude-3-7-sonnet-20250219-v1:0

  - model_name: bedrock-claude-sonnet-4
    litellm_params:
      model: bedrock/us.anthropic.claude-sonnet-4-20250514-v1:0

  - model_name: bedrock-claude-sonnet-4.5
    litellm_params:
      model: bedrock/us.anthropic.claude-sonnet-4-5-20250929-v1:0

  - model_name: bedrock-claude-opus-4
    litellm_params:
      model: bedrock/us.anthropic.claude-opus-4-20250514-v1:0

  - model_name: bedrock-claude-opus-4.1
    litellm_params:
      model: bedrock/us.anthropic.claude-opus-4-1-20250805-v1:0

litellm_settings:
  drop_params: true
  turn_off_message_logging: true

4. Create a run.bat file in the same directory and edit it to add this command. Run the bat file and wait for it to be ready. Change your API to Custom (OpenAI-compatible) in SillyTavern and put in the base url below.

litellm --config config.yaml --port 4000
Custom Endpoint (Base URL): http://localhost:4000/v1

(Optional) Enable Prompt Caching

(Optional) Get access to the built-in dashboard of LiteLLM to track your requests and spendings.

DM me on reddit for this until I finish making a guide.

Portkey

1. Go to ORG MODULES → Integrations → Bedrock through the navigation panel on the left. Enter your Access Key, Secret Access Key and us-east-1 as region.

2. Allow the models you wanna use on the next page (the ones starting with us.anthropic).

3. Create an API key on Portkey.

4. Change your API to Custom (OpenAI-compatible) and put in the base url.

Custom Endpoint (Base URL): https://api.portkey.ai/v1

Here's some model ID's you can put in:

us.anthropic.claude-3-7-sonnet-20250219-v1:0
us.anthropic.claude-sonnet-4-20250514-v1:0
us.anthropic.claude-sonnet-4-5-20250929-v1
us.anthropic.claude-opus-4-20250514-v1:0
us.anthropic.claude-opus-4-1-20250805-v1:0

You need to put the model ID as @<integration-name>/<modelID>. There is an example of it in Getting Started page.

(Optional) Enable Prompt Caching

Vercel

1. Enter a credit card in Dashboard → Settings → Billing so you can access their service (You'll get Error 400 if you don't).

2. Go into Dashboard → API Gateway → Integrations → Amazon Bedrock. Enter your Access Key, Secret Access key and us-east-1 as region.

3. Create an API key on Vercel. Change your API to Custom (OpenAI-compatible) and put in the base url.

Custom Endpoint (Base URL): https://ai-gateway.vercel.sh/v1/

Prompt Caching

(Only for SillyTavern)

What is prompt caching? Why is it so important?

It lowers the cost of your requests by a significant amount, making your free trial credits last up to 5-10x longer.

Prompt caching, which enables developers to cache frequently used context between API calls, is now available on the Anthropic API. With prompt caching, customers can provide Claude with more background knowledge and example outputs—all while reducing costs by up to 90% and latency by up to 85% for long prompts.

It is important not to cache miss, because that would only increase the cost since the cache write is 1.25x/2x more expensive than normal requests. So in order to cache hit successfuly, you should:

Disable lorebooks
Disable randomized prompts
Set your cacheAtDepth above your prompt injections, if there's any. (Tracker, Guided Generations, Author's Note and etc)

To enable prompt caching, you need to edit some values in ~\SillyTavern\config.yaml under claude:

⎗
✓
claude:
  enableSystemPromptCache: true
  cachingAtDepth: 2
  extendedTTL: false

Also, I recommend using Cache Refresh Extension instead of turning on extendedTTL. The cost is cheaper until ~35 minute point. (If you think you need more time, then it's worth to turn on extendedTTL.)

Model	Input	Output	Cache Read	Cache Write (5min)	Cache Write (1hr)
Sonnet	$3 / MTok	$15 / MTok	$0.3$ / MTok	$3.75 / MTok	$6 / MTok
Opus	$15 / MTok	$75 / MTok	$1.5$ / MTok	$18.75 / MTok	$30 / MTok

Manually Enable Prompt Caching

(Needed for LiteLLM and Portkey. You can skip this if you're using Openrouter.)

This change will enable claude prompt caching on all Custom (OpenAI-compatible) endpoints.

Also might break SillyTavern backend completely if steps were followed incorrectly.

Go into ~\SillyTavern\src\endpoints\backends and edit the file named chat-completion.js.

Around L1536 there's the custom endpoint request section:

⎗
✓
} else if (request.body.chat_completion_source === CHAT_COMPLETION_SOURCES.CUSTOM) {
    apiUrl = request.body.custom_url;
    apiKey = readSecret(request.user.directories, SECRET_KEYS.CUSTOM);
    headers = {};
    bodyParams = {
        logprobs: request.body.logprobs,
        top_logprobs: undefined,
    };

    // Adjust logprobs params for Chat Completions API, which expects { top_logprobs: number; logprobs: boolean; }
    if (!isTextCompletion && bodyParams.logprobs > 0) {
        bodyParams.top_logprobs = bodyParams.logprobs;
        bodyParams.logprobs = true;
    }

    mergeObjectWithYaml(bodyParams, request.body.custom_include_body);
    mergeObjectWithYaml(headers, request.body.custom_include_headers);
    }
}

Add the next part inside the else if statement:

⎗
✓
} else if (request.body.chat_completion_source === CHAT_COMPLETION_SOURCES.CUSTOM) {
    .
    .
    .
    // This part below after mergeObjectWithYaml() function calls
    const cachingAtDepth = getConfigValue('claude.cachingAtDepth', -1, 'number');
    const cacheTTL = getConfigValue('claude.extendedTTL', false, 'boolean') ? '1h' : '5m';
    if (Number.isInteger(cachingAtDepth) && cachingAtDepth >= 0) {
        cachingAtDepthForOpenRouterClaude(request.body.messages, cachingAtDepth, cacheTTL);
    }
}

Save the change and restart SillyTavern.
If it was successful, you should see prompt caching usage in LiteLLM console.
Here's an example:
"usage" {"cacheReadInputTokenCount":9224,"cacheReadInputTokens":9224,"cacheWriteInputTokenCount":0,"cacheWriteInputTokens":0,"inputTokens":465,"outputTokens":328,"totalTokens":10017}}

You should turn off streaming in order to inspect the console without all the noise.

Errors and Features

- Internal Server Error 500

Wrong/missing key information. Recreate or reenter your credentials and try again.

- Key validation failed: You don't have access to the model with the specified model ID.

Make sure you have requested access to all the models on Amazon Bedrock.

- Operation not allowed.

Your account is probably on hold. Check your emails from AWS.

- The model returned the following errors: temperature and top_p cannot both be specified for this model. Please use only one.

To use Sonnet 4.5, you have to exclude the top_p parameter because it seems like Bedrock can't handle both temperature and top_p at the same time.

Top Bar → Connection Profile → Additional Parameters → Exclude Body Parameters

Enter these parameters to exclude:

⎗

1
2
3

- frequency_penalty
- presence_penalty 
- top_p

- Enabling 1M Context For Sonnet 4.5

Top Bar → Connection Profile → Additional Parameters → Include Body Parameters

Enter this parameter to include:

⎗

1 2	anthropic_beta: - context-1m-2025-08-07

FAQ

Q: How should I request access to Anthropic models?

You should use your own personal info when you are requesting access to Anthropic's models.

Business name: Your actual name
Business website: Your personal website

You can check their Commercial TOS.

Q: Will I be charged once my credits are depleted or when the subscription ends?

Your account is closed once your subscription expires, so no. Check their FAQ.

Your free plan expires the earlier of (1) 6-months from the date you opened your AWS account, or (2) once you have exhausted your Free Tier credits.

When your free plan expires, AWS closes your account, and you’ll lose access to your resources and data. AWS will retain your data for 90 days after your free plan expires. During this period, you have the option to upgrade to paid plan to reopen your account and restore access to your resources. If you don’t upgrade your account within 90 days, AWS will permanently erase your AWS account and all its content.

Becareful not to automatically upgrade your account to Pro though.

However, your account will automatically upgrade to a paid plan in certain cases, such as when you create or join an AWS Organization, set up an AWS Control Tower landing zone, join AWS Partner Network, create a Professional Services contract, enroll in an Enterprise Agreement with AWS, purchase an AWS Skill Builder Team subscription, or designate your AWS account as HIPAA or SEC compliant.

Q: Does Amazon Bedrock store my prompts?

No, check their docs.

Amazon Bedrock doesn't store or log your prompts and completions. Amazon Bedrock doesn't use your prompts and completions to train any AWS models and doesn't distribute them to third parties.

However, there is an abuse detection system. It mainly talks about image generation, but I don't know if it applies for text generation too.

Q: How long would my trial credits last?

Estimation:

Input tokens: 10,000 per request (average context length).
Output tokens: 300 per request (average response length).
Prompt caching enabled and estimation of 75% success rate (varies with cache misses and cacheAtDepth value)

Model	Without prompt caching	With prompt caching
Sonnet	~5,797 messages	~19,753 messages
Opus	~1,159 messages	~3,950 messages

AWS Free Trial Guide

Table of contents

First let's set up your AWS account

Choose a Gateway or a Proxy

Openrouter

LiteLLM

Portkey

Vercel

Prompt Caching

What is prompt caching? Why is it so important?

Manually Enable Prompt Caching

Errors and Features

- Internal Server Error 500

- Key validation failed: You don't have access to the model with the specified model ID.

- Operation not allowed.

- The model returned the following errors: temperature and top_p cannot both be specified for this model. Please use only one.

- Enabling 1M Context For Sonnet 4.5

FAQ

Q: How should I request access to Anthropic models?

Q: Will I be charged once my credits are depleted or when the subscription ends?

Q: Does Amazon Bedrock store my prompts?

Q: How long would my trial credits last?

Warning