Replit DB limits?

**Is there a limit on Replit DB entries?**

I’ve been trying to add numerous DB entries, but the database seems to hit a limit at 5,000 entries, after which writes fail.

I have a Core account.

3 Likes

This should be way bigger, especially when you DON’T want a SQL-based structure.

As a Core user myself, I’d prefer a larger ReplitDB over Postgres.

Personally, I don’t like making table structures, but if that’s the way it is…

2 Likes

There seem to be many of these little-known snags. If Replit has churn issues… I can see why.

Off the top of my head:

- disconnected/connecting issues (‘kill 1’ in the shell if you’re experiencing this - works for me sometimes)
- Selenium is a bit of a nightmare to use here
- the Replit DB limit

3 Likes

Adding limited AI use per month. That just drives me to ChatGPT, and then why am I here again?

Lmao, true.

The AI usage limit is kinda tiny.

2 Likes

Actually, ReplitDB has 50 MB of storage. I believe usage is counted by the size of each item, not by the number of keys, so the bigger the items, the more storage they take up.
Since your items are pretty short, it can store 5,000 of them.


Maybe the storage can be upped for Core?

2 Likes

Actually, there is both a maximum key count and a maximum size limit: 50 MB of storage and 5,000 keys are the upper limits.
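
A quick way to see how close you are to the key cap (a sketch, assuming the Python replit client):

from replit import db

# Count existing keys against the 5,000-key limit. The 50 MB storage limit
# is separate and depends on the size of your values.
print(f"{len(list(db.keys()))} of 5000 keys used")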

4 Likes

Are there limits on Postgres as well?

I’m not sure. I’ve never used the postgres DB.

ReplitDB would be so much more useful for me with 50,000 keys instead of 5,000. Who can we petition? I can’t imagine this is a substantial source of scarce resource utilization. Computers are really good at hashing keys into very big hash tables.

1 Like

Well, ReplitDB is pretty much an unmaintained project. I’ve built libraries for it, but it’s a product Replit doesn’t care about because it doesn’t make money (I think); as a library dev for it I’ve found and reported bugs with the API which got noticed but not fixed. Anyway, I’d recommend making a single key whose value is a JSON-stringified object and using that, because there is a decent size limit for values (5 MB, I think), and if that’s not enough you can make 10 of these keys.
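
A minimal sketch of that single-key approach, assuming the Python replit client (the key name 'bucket_0' and the record layout are just examples):

import json
from replit import db

# Pack many small records into one JSON-stringified value under a single key.
records = {f"user_{i}": {"score": i} for i in range(10_000)}
db["bucket_0"] = json.dumps(records)

# Read the whole bucket back and decode it before use.
records = json.loads(db["bucket_0"])
print(len(records))  # 10000

Just keep each stringified bucket under the per-value size limit, and split into more bucket keys if it grows past that.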

2 Likes

Like this one I PR’d?

3 Likes

The root of the problem isn’t the slashes, but yeah, it’s an issue with the API itself, not a specific package. I managed to get them to copy some of my code and fix the issues in the official JS package with a lot of pestering, but the backend needs fixes as well.

1 Like

If this would work in your context, I think there is a way to have 50,000 keys in a Replit DB. Just make one entry within the Replit DB a dictionary:

db['large_dataset'] = {}

And from there you should be able to add as many keys as you like to db['large_dataset'].

Proof of concept:

from replit import db
import copy

my_dict = {}

for item in range(0,50000):
  my_dict[str(item)] = 'value'

# my_dict is now a dict with 50000 keys.

db['large_dataset'] = copy.deepcopy(my_dict)

# copied to replit db

print(len(db['large_dataset']))

# Should contain 50000 keys.

3 Likes

Thanks, that’s perfect. I just need to store it back after modification, which seems fast enough.
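
That round trip looks roughly like this (a sketch, assuming the Python replit client and the 'large_dataset' key from the example above):

from replit import db

# Keep (or read back) a plain local copy of the dict, modify it locally,
# then write the whole thing back to the single key in one operation.
my_dict = {str(i): 'value' for i in range(50000)}
my_dict['50000'] = 'another value'
db['large_dataset'] = my_dict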

For reference, the docs say:

**What limits does Database have?**
The limits are currently:

- 50 MiB per database (sum of keys and values)
- 5,000 keys per database
- 1024 bytes per key
- 5 MiB per value

There are rate limits that apply to all operations. You will receive an HTTP 429 if you exceed them. We recommend implementing an exponential backoff and retry to handle this case.

https://docs.replit.com/hosting/databases/replit-database
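
And a minimal backoff-and-retry sketch for writes, assuming the Python replit client (the broad except is an assumption, since the exact exception raised on an HTTP 429 depends on the client version):

import random
import time
from replit import db

def set_with_backoff(key, value, max_retries=5):
    # Retry a failed write with exponential backoff plus jitter: ~1s, 2s, 4s, ...
    for attempt in range(max_retries):
        try:
            db[key] = value
            return
        except Exception:  # assumption: rate-limit errors surface as exceptions here
            time.sleep(2 ** attempt + random.random())
    raise RuntimeError(f"write failed after {max_retries} retries: {key}")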

3 Likes

After looking around, the best fast, suitable free-tier replacement for ReplitDB has got to be Google Cloud Datastore, with a gigabyte of storage, 20,000 free writes, and 50,000 free reads per day: Pricing | Datastore | Google Cloud

It’s pretty easy:

$ pip install google-cloud-datastore google-auth

from collections import UserDict
from google.cloud import datastore
from google.oauth2 import service_account
from google.cloud.datastore.entity import Blob
import json
import os

class DatastoreDict(UserDict):
    def __init__(self, project_id, kind='kvstore'):
        super().__init__()
        self.kind = kind
        self.client = self._create_datastore_client(project_id)

    def _create_datastore_client(self, project_id):
        # Assuming the service account JSON is in an environment variable
        service_account_info = json.loads(os.environ.get("GOOGLE_SERVICE_ACCOUNT_KEY_JSON"))
        credentials = service_account.Credentials.from_service_account_info(service_account_info)
        return datastore.Client(project=project_id, credentials=credentials)

    def _get_entity_key(self, key):
        return self.client.key(self.kind, key)

    def __getitem__(self, key):
        entity_key = self._get_entity_key(key)
        entity = self.client.get(entity_key)
        if not entity:
            raise KeyError(key)
        return entity['value']

    def __setitem__(self, key, value):
        entity_key = self._get_entity_key(key)
        entity = datastore.Entity(key=entity_key)
        # Check if the value is bytes and store as Blob
        if isinstance(value, bytes):
            entity['value'] = Blob(value)
        else:
            entity['value'] = value
        self.client.put(entity)

    def __delitem__(self, key):
        entity_key = self._get_entity_key(key)
        self.client.delete(entity_key)

    def get(self, key, default=None):
        try:
            return self[key]
        except KeyError:
            return default

    def keys(self):
        query = self.client.query(kind=self.kind)
        return [entity.key.name for entity in query.fetch()]

    def prefix(self, string):
        return [key for key in self.keys() if key.startswith(string)]

# Example Usage
# db = DatastoreDict(project_id="YourGCPProjectID")
# db['test_key'] = 'test_value'
# print(db.get('test_key'))
# print(db.keys())
# print(db.prefix("test"))
2 Likes