Replit DB limits?

**Is there a limit on Replit DB entries?**

I’ve been trying to add numerous DB entries, but the database seems to hit a limit at 5,000 entries, after which writes fail.

I have a Core account.

3 Likes

This should be way bigger, especially when you DON’T want a SQL-based structure.

As a Core user myself, I’d prefer a larger ReplitDB over Postgres.

Personally, I don’t like making table structures, but if that’s the way it is…

2 Likes

There seem to be many of these little-known snags. If Replit has churn issues… I can see why.

Off the top of my head:

- disconnected/connecting issues (‘kill 1’ in the shell if you’re experiencing this - works for me sometimes)
- Selenium is a bit of a nightmare to use here
- the Replit DB limit

3 Likes

Adding limited AI use per month. That just drives me to ChatGPT, and then why am I here again?

Lmao, true.

The AI usage limit is kinda tiny.

2 Likes

Actually, ReplitDB has 50 MB of storage. I believe usage is counted by the size of each item, not by the number of keys, so the bigger the items, the more storage they take up.
Since your items are pretty short, it can store 5,000 of them.


Maybe the storage can be upped for Core?

2 Likes

Actually, there is both a maximum key count and a maximum size limit: 50 MB of storage and 5,000 keys are the upper limits.
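
A quick way to see how close you are to the key cap (a sketch, assuming the Python replit client):

from replit import db

# Count existing keys against the 5,000-key limit. The 50 MB storage limit
# is separate and depends on the size of your values.
print(f"{len(list(db.keys()))} of 5000 keys used")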

4 Likes

Are there limits on Postgres as well?

I’m not sure. I’ve never used the postgres DB.

ReplitDB would be so much more useful for me with 50,000 keys instead of 5,000. Who can we petition? I can’t imagine this is a substantial source of scarce resource utilization. Computers are really good at hashing keys into very big hash tables.

1 Like

Well, ReplitDB is pretty much an unmaintained project. I’ve built libraries for it, but it’s a product Replit doesn’t care about because it doesn’t make money (I think); as a library dev for it I’ve found and reported bugs with the API which got noticed but not fixed. Anyway, I’d recommend making a single key whose value is a JSON-stringified object and using that, because there is a decent size limit for values (5 MB, I think), and if that’s not enough you can make 10 of these keys.
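
A minimal sketch of that single-key approach, assuming the Python replit client (the key name 'bucket_0' and the record layout are just examples):

import json
from replit import db

# Pack many small records into one JSON-stringified value under a single key.
records = {f"user_{i}": {"score": i} for i in range(10_000)}
db["bucket_0"] = json.dumps(records)

# Read the whole bucket back and decode it before use.
records = json.loads(db["bucket_0"])
print(len(records))  # 10000

Just keep each stringified bucket under the per-value size limit, and split into more bucket keys if it grows past that.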

2 Likes

Like this one I PR’d?

3 Likes

The root of the problem isn’t the slashes, but yeah, it’s an issue with the API itself, not a specific package. I managed to get them to copy some of my code and fix the issues in the official JS package with a lot of pestering, but the backend needs fixes as well.

1 Like

If this would work in your context, I think there is a way to have 50,000 keys in a Replit DB. Just make one entry within the Replit DB a dictionary:

db['large_dataset'] = {}

And from there you should be able to add as many keys as you like to db['large_dataset'].

Proof of concept:

from replit import db
import copy

my_dict = {}

for item in range(0,50000):
  my_dict[str(item)] = 'value'

# my_dict is now a dict with 50000 keys.

db['large_dataset'] = copy.deepcopy(my_dict)

# copied to replit db

print(len(db['large_dataset']))

# Should contain 50000 keys.

3 Likes

Thanks, that’s perfect. I just need to store it back after modification, which seems fast enough.
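
That round trip looks roughly like this (a sketch, assuming the Python replit client and the 'large_dataset' key from the example above):

from replit import db

# Keep (or read back) a plain local copy of the dict, modify it locally,
# then write the whole thing back to the single key in one operation.
my_dict = {str(i): 'value' for i in range(50000)}
my_dict['50000'] = 'another value'
db['large_dataset'] = my_dict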

For reference, the docs say:

**What limits does Database have?**
The limits are currently:

- 50 MiB per database (sum of keys and values)
- 5,000 keys per database
- 1024 bytes per key
- 5 MiB per value

There are rate limits that apply to all operations. You will receive an HTTP 429 if you exceed them. We recommend implementing an exponential backoff and retry to handle this case.

https://docs.replit.com/hosting/databases/replit-database
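
And a minimal backoff-and-retry sketch for writes, assuming the Python replit client (the broad except is an assumption, since the exact exception raised on an HTTP 429 depends on the client version):

import random
import time
from replit import db

def set_with_backoff(key, value, max_retries=5):
    # Retry a failed write with exponential backoff plus jitter: ~1s, 2s, 4s, ...
    for attempt in range(max_retries):
        try:
            db[key] = value
            return
        except Exception:  # assumption: rate-limit errors surface as exceptions here
            time.sleep(2 ** attempt + random.random())
    raise RuntimeError(f"write failed after {max_retries} retries: {key}")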

3 Likes

After looking around, the best fast, suitable free-tier replacement for ReplitDB has got to be Google Cloud Datastore, with a gigabyte of storage, 20,000 free writes, and 50,000 free reads per day: Pricing | Datastore | Google Cloud

It’s pretty easy:

$ pip install google-cloud-datastore google-auth

from collections import UserDict
from google.cloud import datastore
from google.oauth2 import service_account
from google.cloud.datastore.entity import Blob
import json
import os

class DatastoreDict(UserDict):
    def __init__(self, project_id, kind='kvstore'):
        super().__init__()
        self.kind = kind
        self.client = self._create_datastore_client(project_id)

    def _create_datastore_client(self, project_id):
        # Assuming the service account JSON is in an environment variable
        service_account_info = json.loads(os.environ.get("GOOGLE_SERVICE_ACCOUNT_KEY_JSON"))
        credentials = service_account.Credentials.from_service_account_info(service_account_info)
        return datastore.Client(project=project_id, credentials=credentials)

    def _get_entity_key(self, key):
        return self.client.key(self.kind, key)

    def __getitem__(self, key):
        entity_key = self._get_entity_key(key)
        entity = self.client.get(entity_key)
        if not entity:
            raise KeyError(key)
        return entity['value']

    def __setitem__(self, key, value):
        entity_key = self._get_entity_key(key)
        entity = datastore.Entity(key=entity_key)
        # Check if the value is bytes and store as Blob
        if isinstance(value, bytes):
            entity['value'] = Blob(value)
        else:
            entity['value'] = value
        self.client.put(entity)

    def __delitem__(self, key):
        entity_key = self._get_entity_key(key)
        self.client.delete(entity_key)

    def get(self, key, default=None):
        try:
            return self[key]
        except KeyError:
            return default

    def keys(self):
        query = self.client.query(kind=self.kind)
        return [entity.key.name for entity in query.fetch()]

    def prefix(self, string):
        return [key for key in self.keys() if key.startswith(string)]

# Example Usage
# db = DatastoreDict(project_id="YourGCPProjectID")
# db['test_key'] = 'test_value'
# print(db.get('test_key'))
# print(db.keys())
# print(db.prefix("test"))
2 Likes