This page shows you how to use the upsert operation to write records into an index namespace. If a record ID already exists, upsert overwrites the entire record. To update only part of a record, use the update operation instead.

When you have multiple records representing chunks of a single parent document, you can prefix each record’s ID with a reference to that parent. For more details, see Manage RAG documents.
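As a minimal sketch of this ID scheme, the helper below builds one ID per chunk, each prefixed with the parent document's ID. The helper name `chunk_ids` and the `#` separator are illustrative assumptions, not something Pinecone requires.

```python
def chunk_ids(parent_id, chunk_count, separator="#"):
    """Return one record ID per chunk, each prefixed with the parent document ID."""
    return [f"{parent_id}{separator}chunk{i}" for i in range(1, chunk_count + 1)]


# Three chunks of a parent document with the (hypothetical) ID "doc1"
ids = chunk_ids("doc1", 3)
print(ids)  # ['doc1#chunk1', 'doc1#chunk2', 'doc1#chunk3']
```

Because every chunk shares the parent prefix, all records that belong to one document can be identified from their IDs alone.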

Upsert records into the default namespace

When upserting records without specifying a namespace, the records are added to the default namespace ("").
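The following sketch shows what such a call might look like. The wrapper `upsert_default`, the record IDs, and the 4-dimensional values are hypothetical; the point is only that no `namespace` argument is passed to `upsert`.

```python
def upsert_default(index, records):
    """Upsert records without a namespace argument, so they go to the default namespace."""
    return index.upsert(vectors=records)


# Illustrative records for an index that is assumed to have dimension 4
records = [
    {"id": "vec1", "values": [0.1, 0.2, 0.3, 0.4]},
    {"id": "vec2", "values": [0.5, 0.6, 0.7, 0.8]},
]

# Usage, assuming an existing index (the index name is a placeholder):
# from pinecone import Pinecone
# pc = Pinecone(api_key="YOUR_API_KEY")
# index = pc.Index("example-index")
# upsert_default(index, records)
```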

Upsert records into non-default namespaces

Namespaces let you partition vectors within a single index. Although optional, they are a best practice for speeding up queries, which can be filtered by namespace, and for complying with multi-tenancy requirements. For example, the following upsert writes records into 2 distinct namespaces:
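A minimal sketch of writing to two namespaces, assuming an index of dimension 4: the namespace names `ns1` and `ns2`, the record IDs, and the helper `upsert_by_namespace` are illustrative. Each upsert request targets exactly one namespace, so the helper sends one request per namespace.

```python
def upsert_by_namespace(index, records_by_namespace):
    """Send one upsert request per namespace."""
    for namespace, records in records_by_namespace.items():
        index.upsert(vectors=records, namespace=namespace)


# Illustrative records, grouped by the namespace they should be written to
records_by_namespace = {
    "ns1": [
        {"id": "vec1", "values": [0.1, 0.2, 0.3, 0.4]},
        {"id": "vec2", "values": [0.5, 0.6, 0.7, 0.8]},
    ],
    "ns2": [
        {"id": "vec1", "values": [0.9, 1.0, 1.1, 1.2]},
    ],
}

# Usage, assuming an existing index (the index name is a placeholder):
# from pinecone import Pinecone
# pc = Pinecone(api_key="YOUR_API_KEY")
# index = pc.Index("example-index")
# upsert_by_namespace(index, records_by_namespace)
```

The same record ID (`vec1` here) can exist in both namespaces, because each namespace is a separate partition of the index.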

Check data freshness

Pinecone is eventually consistent, so there can be a slight delay before new or changed records are visible to queries. After adding, updating, or deleting records, use the describe_index_stats operation to check if the current record count matches the number of records you expect.

For pod-based indexes, keep in mind that if you have multiple replicas, they may not all become consistent at the same time.
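One way to perform this check is to poll until the count matches. The sketch below assumes the response of `describe_index_stats` exposes a `total_vector_count` attribute, as recent versions of the Python client do; the helper name, timeout, and polling interval are hypothetical choices.

```python
import time


def wait_for_total_count(index, expected_count, timeout_s=30.0, poll_interval_s=1.0):
    """Poll describe_index_stats until the total record count matches, or time out.

    Returns True if the expected count was observed, False on timeout.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        stats = index.describe_index_stats()
        # Assumption: the stats response has a total_vector_count attribute
        if stats.total_vector_count == expected_count:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval_s)


# Usage, assuming `index` targets an existing index and 10000 records are expected:
# if not wait_for_total_count(index, expected_count=10000):
#     print("Record count has not caught up yet")
```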

Upsert records with metadata

You can attach metadata key-value pairs to records. This lets you then filter queries to retrieve only records that match the metadata filter. For more information, see Metadata Filtering.
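As a rough illustration, a record with metadata carries a `metadata` dictionary next to its `id` and `values`. The helper `make_record` and the `genre` and `year` keys are hypothetical examples, and the index is assumed to have dimension 4.

```python
def make_record(record_id, values, metadata=None):
    """Build a record dict; metadata is attached only when provided."""
    record = {"id": record_id, "values": values}
    if metadata:
        record["metadata"] = metadata
    return record


# Illustrative records; the metadata keys are made up for this sketch
records = [
    make_record("vec1", [0.1, 0.2, 0.3, 0.4], {"genre": "drama", "year": 2020}),
    make_record("vec2", [0.5, 0.6, 0.7, 0.8], {"genre": "action", "year": 2021}),
    make_record("vec3", [0.9, 1.0, 1.1, 1.2]),  # metadata is optional
]

# Usage, assuming `index` targets an existing index:
# index.upsert(vectors=records)
```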

Upsert records with sparse values

You can upsert records with sparse vector values alongside dense vector values. This allows you to perform hybrid search, which combines semantic and keyword search in one query, for more relevant results.

See Upsert sparse-dense vectors.

This feature is in public preview. Review the current limitations and considerations for serverless indexes, and test thoroughly before using it in production.
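To illustrate the record shape for a sparse-dense upsert, the sketch below pairs dense `values` with a `sparse_values` dictionary that holds parallel `indices` and `values` lists. The helper `make_sparse_dense_record` and the sample numbers are hypothetical.

```python
def make_sparse_dense_record(record_id, dense_values, sparse_weights):
    """Build a record dict with dense values plus sparse values.

    sparse_weights maps a sparse dimension index to its weight.
    """
    indices = sorted(sparse_weights)
    return {
        "id": record_id,
        "values": dense_values,
        "sparse_values": {
            "indices": indices,
            "values": [sparse_weights[i] for i in indices],
        },
    }


# Illustrative record: 3 dense dimensions and 2 non-zero sparse dimensions
record = make_sparse_dense_record("vec1", [0.1, 0.2, 0.3], {5: 0.5, 1: 0.25})

# Usage, assuming `index` targets an existing index that supports sparse values:
# index.upsert(vectors=[record])
```

Building `indices` and `values` from a single mapping keeps the two lists the same length and in the same order.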

Upsert records in batches

When upserting large amounts of data, split the records into batches of 100 or fewer and send each batch in its own upsert request.

Example

Python
import random
import itertools
from pinecone import Pinecone

pc = Pinecone(api_key="YOUR_API_KEY")
index = pc.Index("pinecone-index")

def chunks(iterable, batch_size=100):
    """A helper function to break an iterable into chunks of size batch_size."""
    it = iter(iterable)
    chunk = tuple(itertools.islice(it, batch_size))
    while chunk:
        yield chunk
        chunk = tuple(itertools.islice(it, batch_size))

vector_dim = 128
vector_count = 10000

# Example generator that generates many (id, vector) pairs
example_data_generator = map(lambda i: (f'id-{i}', [random.random() for _ in range(vector_dim)]), range(vector_count))

# Upsert data with 100 vectors per upsert request
for ids_vectors_chunk in chunks(example_data_generator, batch_size=100):
    index.upsert(vectors=ids_vectors_chunk) 

Send upserts in parallel

Send multiple upserts in parallel to help increase throughput.

Standard Python client

Using the standard Python client, all vector operations block until the response has been received. However, they can be made asynchronous by setting pool_threads during client initialization and passing async_req to the upsert operation. For the batch upsert example above, this would be done as follows:

Python
import random
import itertools
from pinecone import Pinecone

# Initialize the client with pool_threads=30 (limits to 30 simultaneous requests)
pc = Pinecone(api_key="YOUR_API_KEY", pool_threads=30)
index = pc.Index("pinecone-index")

def chunks(iterable, batch_size=100):
    """A helper function to break an iterable into chunks of size batch_size."""
    it = iter(iterable)
    chunk = tuple(itertools.islice(it, batch_size))
    while chunk:
        yield chunk
        chunk = tuple(itertools.islice(it, batch_size))

vector_dim = 128
vector_count = 10000

example_data_generator = map(lambda i: (f'id-{i}', [random.random() for _ in range(vector_dim)]), range(vector_count))

# Upsert data with 100 vectors per upsert request asynchronously
# - Pass async_req=True to index.upsert()
with pc.Index('pinecone-index', pool_threads=30) as index:
    # Send requests in parallel
    async_results = [
        index.upsert(vectors=ids_vectors_chunk, async_req=True)
        for ids_vectors_chunk in chunks(example_data_generator, batch_size=100)
    ]
    # Wait for and retrieve responses (this raises in case of error)
    [async_result.get() for async_result in async_results]

gRPC Python client

The gRPC version of the Python client can provide higher upsert speeds than the standard client. Through multiplexing, gRPC can handle large numbers of requests in parallel without head-of-line (HoL) blocking slowing down the rest of the system, unlike REST. Moreover, you can pass various retry strategies to the gRPC client, including exponential backoffs.

To install the gRPC version of the client:

Shell
pip3 install "pinecone-client[grpc]"

To use the gRPC client, import the pinecone.grpc subpackage and target an index as usual:

Python
from pinecone.grpc import PineconeGRPC as Pinecone

pc = Pinecone(api_key='YOUR_API_KEY')  # This is gRPC client aliased as "Pinecone"
index = pc.Index('example-index')

To launch multiple read and write requests in parallel, pass async_req to the upsert operation:

Python
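# Assumes `data` is a list of (id, vector) tuples, so it supports len() and slicing
def chunker(seq, batch_size):
  """Return a generator of successive slices of seq with at most batch_size items each."""
  return (seq[pos:pos + batch_size] for pos in range(0, len(seq), batch_size))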

async_results = [
  index.upsert(vectors=chunk, async_req=True)
  for chunk in chunker(data, batch_size=100)
]

# Wait for and retrieve responses (this raises in case of error)
[async_result.result() for async_result in async_results]

It is possible to get write-throttled faster when upserting using the gRPC client. If you see this often, we recommend you use a backoff algorithm (e.g., exponential backoffs) while upserting.
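A backoff loop of this kind could be sketched as follows. This is an illustration under assumptions, not the client's built-in retry mechanism: the helper name, the delay parameters, and the choice to retry on any `Exception` are all hypothetical, and production code would likely retry only on throttling or other transient errors.

```python
import random
import time


def upsert_with_backoff(index, vectors, max_attempts=5, base_delay_s=0.5,
                        max_delay_s=30.0, sleep=time.sleep):
    """Retry an upsert with exponential backoff and random jitter.

    Re-raises the last error once max_attempts is exhausted.
    """
    for attempt in range(max_attempts):
        try:
            return index.upsert(vectors=vectors)
        except Exception:  # Assumption: every error is treated as retryable
            if attempt == max_attempts - 1:
                raise
            # The delay cap doubles on each attempt, up to max_delay_s
            cap = min(max_delay_s, base_delay_s * 2 ** attempt)
            # Jitter spreads out retries from parallel workers
            sleep(random.uniform(0, cap))


# Usage, assuming `index` targets an existing index and `chunk` is one batch:
# upsert_with_backoff(index, chunk)
```

The `sleep` parameter exists only so that the waiting behavior can be replaced in tests.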

The syntax for upsert, query, fetch, and delete with the gRPC client remains the same as with the standard client.

Troubleshoot index fullness errors

Serverless indexes automatically scale as needed.

However, pod-based indexes can run out of capacity. When that happens, upserting new records will fail with the following error:

console
Index is full, cannot accept data.

While a full pod-based index can still serve queries, you need to scale your index to accommodate more records.
