Understanding cost

This page describes how costs are incurred in Pinecone for both serverless and pod-based indexes.

Serverless indexes

Serverless indexes are in public preview. Check current limitations and test thoroughly before using them in production.

With serverless indexes, you don’t configure or manage any compute or storage resources. Instead, built on a breakthrough architecture, serverless indexes scale automatically with usage, and you pay only for the amount of data stored and the operations performed, measured by three usage metrics:

Read units (RUs): Read units measure the resources consumed by read operations such as query, fetch, and list.
Write units (WUs): Write units measure the resources used by write operations such as upsert, update, and delete.
Storage: You’re billed for the size of an index at a per-GB rate. The exact rate for each of these units depends on your cloud provider, region, and plan (see our pricing page).

Read units

Read operations consume read units (RUs). Read units measure the compute, I/O, and network resources used during the read process.

The following operations consume RUs: fetch, query, and list.

The number of RUs used by a specific request is always included in its response. For a demonstration of how to use read units to inspect read costs, see this notebook.

Fetch

A fetch request uses 1 RU for every 10 fetched records.

# of fetched records	RU usage
10	1
50	5
107	11

Specifying a non-existent ID or specifying the same ID more than once does not increase the number of RUs used. However, a fetch request always uses at least 1 RU.
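The fetch rule can be sketched as a small estimator. This is only an illustration of the arithmetic described above, not Pinecone's billing code: the function names are hypothetical, and rounding up to the next whole RU is inferred from the 107-record row in the table.

```python
import math


def count_fetched(requested_ids, existing_ids):
    """Count records actually fetched: duplicates and non-existent IDs are ignored."""
    return len(set(requested_ids) & set(existing_ids))


def fetch_read_units(fetched_records):
    """Estimate RUs for one fetch request.

    1 RU per 10 fetched records, rounded up (assumption inferred from
    the table), with a minimum of 1 RU per request.
    """
    if fetched_records < 0:
        raise ValueError("fetched_records must be non-negative")
    return max(1, math.ceil(fetched_records / 10))


if __name__ == "__main__":
    for n in (10, 50, 107):
        print(n, fetch_read_units(n))
```

For the record counts in the table (10, 50, and 107), the estimator gives 1, 5, and 11 RUs.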

Query

The number of RUs used by a query is proportional to the following factors:

Record count: The number of vectors contained in the target index. Only vectors stored in the relevant namespace are used.
Record size: Higher dimensionality or larger metadata increases the size of each scanned vector.

Because serverless indexes organize vectors in similarity-based clusters, only a fraction of each index will be read for each query. The number of RUs a query uses therefore increases much more slowly than the index size.

The following table contains the RU cost of a query at different namespace sizes and record dimensionality, assuming an average metadata size around 500 bytes:

Records per namespace	Dimension=384	Dimension=768	Dimension=1536
100,000	5 RUs	5 RUs	6 RUs
1,000,000	6 RUs	10 RUs	18 RUs
10,000,000	18 RUs	32 RUs	59 RUs

Scanning a namespace has a minimum cost of 5 RUs.

When either include_metadata or include_values is specified, an internal fetch call retrieves the full record values for the IDs returned in the initial scan. This stage consumes the same number of RUs as a matching fetch call: 1 RU per 10 records in the result set.

TopK value	Additional RUs used
TopK=5	1
TopK=10	1
TopK=50	5
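The two query stages can be combined in a rough estimator. The sketch below assumes that the total is the scan cost plus the additional fetch-stage cost, which is how the "Additional RUs used" column reads. The function name and parameters are hypothetical, and the scan cost itself is taken as an input (for example, from the table of namespace sizes) rather than computed.

```python
import math


def query_read_units(scan_rus, returned_records, include_payload):
    """Estimate total RUs for one query.

    scan_rus:         RUs for the initial scan (taken as an input, not derived here)
    returned_records: number of records in the result set (at most top_k)
    include_payload:  True if include_metadata or include_values is specified
    """
    # Scanning a namespace costs at least 5 RUs.
    scan = max(5, scan_rus)
    # Assumption: the internal fetch stage adds 1 RU per 10 returned records, rounded up.
    extra = math.ceil(returned_records / 10) if include_payload else 0
    return scan + extra


if __name__ == "__main__":
    # 1,000,000 records at dimension 1536 (18 RUs in the table), top_k = 10
    print(query_read_units(18, 10, include_payload=True))
```

Under these assumptions, a query against 1,000,000 records of dimension 1536 with top_k = 10 and metadata included comes to 18 + 1 = 19 RUs.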

List

List has a fixed cost of 1 RU per call, with an additional 1 RU per paginated call.

Write units

Write operations consume write units (WUs). Write units measure the storage and compute resources used to persist a record, make it available for querying, and update the clustered index to reflect its addition.

The following operations consume WUs: upsert, update, and delete.

Upsert

The number of WUs used by an upsert request is proportional to the total size of records it writes and/or modifies, with a minimum of 1 WU.

The following table contains the WU cost of an upsert request at different batch sizes and record dimensionality, assuming an average metadata size around 500 bytes:

Records per batch	Dimension=384	Dimension=768	Dimension=1536
1	3 WUs	4 WUs	7 WUs
10	30 WUs	40 WUs	70 WUs
100	300 WUs	400 WUs	700 WUs
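The table rows can be approximated with a short estimator. Note the assumptions: this page does not state the size covered by one WU, so the rate of 1 WU per 1,000 bytes, rounded up per record, is inferred from the table and may not match actual billing. The 4 bytes per dimension comes from the Storage section of this page. All names are hypothetical.

```python
import math

# ASSUMPTION: inferred from the table above, not stated on this page.
BYTES_PER_WU = 1000


def record_size_bytes(dimension, metadata_bytes, id_bytes=0):
    """Record size: ID + embedding (4 bytes per dimension) + metadata."""
    return id_bytes + 4 * dimension + metadata_bytes


def upsert_write_units(records, dimension, metadata_bytes, id_bytes=0):
    """Estimate WUs for one upsert request of equally sized records."""
    size = record_size_bytes(dimension, metadata_bytes, id_bytes)
    per_record = max(1, math.ceil(size / BYTES_PER_WU))
    # A request always uses at least 1 WU.
    return max(1, records * per_record)


if __name__ == "__main__":
    for dim in (384, 768, 1536):
        print(dim, [upsert_write_units(n, dim, 500) for n in (1, 10, 100)])
```

With 500 bytes of metadata per record, the estimator reproduces the table: 3, 4, and 7 WUs per record at dimensions 384, 768, and 1536, scaling linearly with batch size.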

Update

The number of WUs used when updating a record is proportional to the total size of the new or previous version of the record, whichever is larger, with a minimum of 1 WU.

The following table contains the WU cost of an update request at different dimensionalities and metadata sizes, with WUs based on the new or previous metadata size, whichever is larger:

Dimension	Previous metadata size	New metadata size	WUs
768	400 bytes	500 bytes	4 WUs
1536	400 bytes	500 bytes	7 WUs
1536	4000 bytes	2000 bytes	11 WUs
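The "whichever is larger" rule can be sketched as follows. As an assumption inferred from the table rows (not stated on this page), one WU is taken to cover 1,000 bytes, rounded up. The embedding is sized at 4 bytes per dimension, as described in the Storage section. The function name is hypothetical, and ID size is ignored for simplicity.

```python
import math

# ASSUMPTION: inferred from the table above, not stated on this page.
BYTES_PER_WU = 1000


def update_write_units(dimension, previous_metadata_bytes, new_metadata_bytes):
    """Estimate WUs for updating one record.

    The cost follows the larger of the previous and new record versions.
    The dimension is the same for both versions, so only metadata differs.
    """
    larger_metadata = max(previous_metadata_bytes, new_metadata_bytes)
    larger_record = 4 * dimension + larger_metadata
    return max(1, math.ceil(larger_record / BYTES_PER_WU))


if __name__ == "__main__":
    print(update_write_units(768, 400, 500))
    print(update_write_units(1536, 400, 500))
    print(update_write_units(1536, 4000, 2000))
```

For the three rows in the table, this gives 4, 7, and 11 WUs. In the last row the previous version (4,000 bytes of metadata) is larger, so it determines the cost even though the new version is smaller.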

Delete

The number of WUs used by a delete request is proportional to the total size of records it deletes, with a minimum of 1 WU.

The following table contains the WU cost of a delete request at different batch sizes and record dimensionality, assuming an average metadata size around 500 bytes:

Records per batch	Dimension=384	Dimension=768	Dimension=1536
1	3 WUs	4 WUs	7 WUs
10	30 WUs	40 WUs	70 WUs
100	300 WUs	400 WUs	700 WUs

Specifying a non-existent ID or specifying the same ID more than once does not increase WU use.

Deleting an entire namespace using the deleteAll flag always consumes 1 WU.

Storage

The size of an index is defined as the total size of its vectors across all namespaces. The size of a single vector is defined as the sum of three components:

ID size
Embedding size (equal to 4 times the vector’s dimensions)
Total metadata size (equal to the total size of all metadata fields)

The following table demonstrates a typical index size at different vector counts and dimensionality:

Records per namespace	Dimension=384	Dimension=768	Dimension=1536
100,000	0.20 GB	0.35 GB	0.66 GB
1,000,000	2.00 GB	3.50 GB	6.60 GB
10,000,000	20.00 GB	35.00 GB	66.00 GB
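The size definition above translates directly into a back-of-the-envelope estimator. The defaults below are assumptions for illustration only: 500 bytes of metadata per record (the figure used in the other tables on this page), an 8-byte ID, and 1 GB taken as 10^9 bytes. The function names are hypothetical.

```python
def vector_size_bytes(dimension, metadata_bytes, id_bytes):
    """Size of one vector: ID + embedding (4 bytes per dimension) + metadata."""
    return id_bytes + 4 * dimension + metadata_bytes


def index_size_gb(record_count, dimension, metadata_bytes=500, id_bytes=8):
    """Estimate index size in GB (assuming 1 GB = 10**9 bytes)."""
    total_bytes = record_count * vector_size_bytes(dimension, metadata_bytes, id_bytes)
    return total_bytes / 1e9


if __name__ == "__main__":
    for dim in (384, 768, 1536):
        print(dim, round(index_size_gb(1_000_000, dim), 2))
```

Under these assumptions, 1,000,000 records of dimension 1536 come to roughly 6.65 GB, close to the 6.60 GB shown in the table. Actual sizes depend on your real ID lengths and metadata.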

Monitoring usage

Index-level monitoring: In the Pinecone console, you can track usage and performance metrics for each index.
Operation-level monitoring: The responses to read operations like query, fetch, and list include the number of read units consumed.

Pod-based indexes

Cost calculation

For each pod-based index, billing is determined by the per-minute price per pod and the number of pods the index uses, regardless of index activity. The per-minute price varies by pod type, pod size, account plan, and cloud region.

Total cost depends on a combination of factors:

Pod type. Each pod type has different per-minute pricing.
Number of pods. This includes replicas, which duplicate pods.
Pod size. Larger pod sizes have proportionally higher costs per minute.
Total pod-minutes. This includes the total time each pod is running, starting at pod creation and rounded up to 15-minute increments.
Cloud provider. The cost per pod-type and pod-minute varies depending on the cloud provider you choose for your project.
Collection storage. Collections incur costs per GB of data per minute in storage, rounded up to 15-minute increments.
Plan. The free plan incurs no costs; the Standard or Enterprise plans incur different costs per pod-type, pod-minute, cloud provider, and collection storage.

The following equation calculates the total costs accrued over time:

(Number of pods) * (pod size) * (number of replicas) * (minutes pod exists) * (pod price per minute) 
+ (collection storage in GB) * (collection storage time in minutes) * (collection storage price per GB per minute)
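The equation can be written as a short calculator. This is a sketch under stated assumptions: the function and parameter names are hypothetical, pod size is treated as a numeric multiplier (x2 becomes 2), and the 15-minute rounding described in the list above is applied to both pod time and collection storage time. Prices are inputs; take real rates from the pricing page.

```python
import math


def billable_minutes(minutes, increment=15):
    """Round a duration up to the next billing increment (15 minutes by default)."""
    return math.ceil(minutes / increment) * increment


def pod_minutes(pods, pod_size, replicas, minutes):
    """(number of pods) * (pod size) * (number of replicas) * (minutes pod exists)."""
    return pods * pod_size * replicas * billable_minutes(minutes)


def total_cost(pods, pod_size, replicas, minutes, pod_price_per_minute,
               collection_gb=0.0, collection_minutes=0.0,
               collection_price_per_gb_minute=0.0):
    """Pod charge plus collection storage charge, following the equation above."""
    pod_charge = pod_minutes(pods, pod_size, replicas, minutes) * pod_price_per_minute
    collection_charge = (collection_gb
                         * billable_minutes(collection_minutes)
                         * collection_price_per_gb_minute)
    return pod_charge + collection_charge


if __name__ == "__main__":
    # One x2 pod with three replicas running for 44,640 minutes (31 days).
    print(pod_minutes(pods=1, pod_size=2, replicas=3, minutes=44_640))
```

For one x2 pod with three replicas running for 44,640 minutes, pod_minutes returns 267,840, matching the pod-minutes figure in Table 1 of the example on this page.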

To see a calculation of your current usage and costs, see the usage dashboard in the Pinecone console.

Pricing

For pod-based index pricing rates, see our pricing page.

Example

While our pricing page lists rates on an hourly basis for ease of comparison, this example lists prices per minute, as this is how Pinecone calculates billing.

An example application has the following requirements:

1,000,000 vectors with 1536 dimensions
150 queries per second with top_k = 10
Deployment in an EU region
Ability to store 1 GB of inactive vectors

Based on these requirements, the organization configures the project on the Standard billing plan to host one p1.x2 pod with three replicas and a collection containing 1 GB of data. The project runs continuously for the month of January. The components of the total cost for this example are given in Table 1 below:

Table 1: Example billing components

Billing component	Value
Number of pods	1
Number of replicas	3
Pod size	x2
Total pod count	6
Minutes in January	44,640
Pod-minutes (pods * minutes)	267,840
Pod price per minute	$0.0012
Collection storage	1 GB
Collection storage minutes	44,640
Price per storage minute	$0.00000056

The invoice for this example is given in Table 2 below:

Table 2: Example invoice

Product	Quantity	Price per unit	Charge
Collections	44,640	$0.00000056	$0.025
P2 Pods (AWS)	0		$0.00
P2 Pods (GCP)	0		$0.00
S1 Pods	0		$0.00
P1 Pods	267,840	$0.0012	$514.29

Amount due $514.54

Cost controls

Pinecone offers tools to help you understand and control your costs.

Monitoring usage. You can use the usage dashboard in the Pinecone console to monitor your Pinecone usage and costs as they accrue.
Pod limits. For pod-based indexes, project owners can set limits for the total number of pods across all indexes in the project. The default pod limit is 5.
